FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-1-align.res made by alexk on Thu 25 Feb 93 10:46:11-PST.

Query sequence being compared: CELSA-1 (1-6)
Number of sequences searched:  50
Number of scores above cutoff: 50


Results of the initial comparison of CELSA-1 (1-6) with: 
File : celsa-1.pep


[Histogram: NUMBER OF SEQUENCES (y-axis, 0 to 100) against initial SCORE (x-axis, 0 to 6), with the corresponding STDEV scale beneath the score axis]


PARAMETERS

Similarity matrix      Unitary     K-tuple              2
Mismatch penalty       5           Joining penalty     20
Gap penalty            1.00        Window size          5
Gap size penalty       0.26
Cutoff score           0
Randomization group    0


SEARCH STATISTICS

Scores:      Mean           Median          Standard Deviation
             4              [illegible]     0.65

Times:       CPU            Total Elapsed
             00:00:00.03    00:00:01.00

Number of residues:              30314
Number of sequences searched:    50
Number of scores above cutoff:   50




oiyriiTii-driut? x s caLUULdoeu Lidiyu 


1 j r i ifuudL bLUf’tf . 


4 100% similar sequences to the query sequence were found:


                                                         Init.  Opt.
    Sequence Name   Description                   Length Score Score   Sig. Frame

 1. GNWVNE          Genome polyprotein - Tick-bor   3414     6     6   3.09     0
 2. POLG_TBEVW      GENOME POLYPROTEIN (CONTAINS:   3414     6     6   3.09     0
 3. GNWVTB          Genome polyprotein - Tick-bor   3412     6     6   3.09     0
 4. POLG_TBEVS      GENOME POLYPROTEIN (CONTAINS:   3412     6     6   3.09     0

The list of other best scores is:

                                                         Init.  Opt.
    Sequence Name   Description                   Length Score Score   Sig. Frame

                    ***** 1 standard deviation above mean
 5. A39246          #Carboxypeptidase A precursor    417     5     5   1.55     0
 6. CBPC_HUMAN      MAST CELL CARBOXYPEPTIDASE A     417     5     5   1.55     0
 7. S04912          #yopA protein - Yersinia ente    455     5     5   1.55     0
 8. A3993S          #Phosphotransferase system en    460     5     5   1.55     0
 9. PT2S_BACSU      PHOSPHOTRANSFERASE ENZYME II,    460     5     5   1.55     0
10. WMBE71          UL13 protein - Human herpesvi    518     5     5   1.55     0
11. D30083          #Gene UL13 protein - Human he    518     5     5   1.55     0
12. KR2_HSV11       PROBABLE SERINE/THREONINE-PRO    518     5     5   1.55     0
13. DNBEV1          DNA-binding protein - Human h   1196     5     5   1.55     0
14. DNBEKS          DNA-binding protein - Human h   1196     5     5   1.55     0
15. DNBEHF          DNA-binding protein - Human h   1196     5     5   1.55     0
16. B300S5          #Gene UL29 protein (major DNA   1196     5     5   1.55     0
17. DNBI_HSV11      MAJOR DNA-BINDING PROTEIN.      1196     5     5   1.55     0
18. DNBI_HSV1F      MAJOR DNA-BINDING PROTEIN.      1196     5     5   1.55     0
19. DNBI_HSV1K      MAJOR DNA-BINDING PROTEIN.      1196     5     5   1.55     0
20. S19926          #Hypothetical protein - Red c   1864     5     5   1.55     0



1. CELSA-1 (1-6)
   GNWVNE       Genome polyprotein - Tick-borne encephalitis virus

ENTRY           GNWVNE #Type Protein
TITLE           Genome polyprotein - Tick-borne encephalitis virus
                (subtype Western, strain Neudoerfl)
INCLUDES        capsid protein C\ membrane protein M\ envelope
                protein E\ nonstructural protein NS1\
                nonstructural protein NS2a\ nonstructural protein
                NS2b\ nonstructural protein NS3\ nonstructural
                protein NS4a\ nonstructural protein NS4b\
                nonstructural protein NS5
DATE            31-Dec-1989 #Sequence 30-Jun-1991 #Text 30-Sep-1992
PLACEMENT       2169.0    4.0    1.0    2.0    1.0
SOURCE          tick-borne encephalitis virus
ACCESSION       A31052\ A32596
REFERENCE
   #Authors     Mandl C.W., Heinz F.X., Kunz C.
   #Journal     Virology (1988) 166:197-205
   #Title       Sequence of the structural proteins of tick-borne
                encephalitis virus (Western subtype) and
                comparative analysis with other flaviviruses.
   #Reference-number A31052
   #Accession   A31052
   #Molecule-type genomic RNA
   #Residues    1-779 <MAN1>
REFERENCE
   #Authors     Mandl C.W., Heinz F.X., Stoeckl E., Kunz C.
   #Journal     Virology (1989) 173:291-301
   #Title       Genome sequence of tick-borne encephalitis virus
                (Western subtype) and comparative analysis of
                nonstructural proteins with other flaviviruses.
   #Reference-number A32596
   #Accession   A32596
   #Molecule-type genomic RNA
   #Residues    767-3414 <MAN2>

COMMENT         This virus is a member of the family Flaviviridae.
SUPERFAMILY     #Name yellow fever virus genome polyprotein
KEYWORDS        capsid protein\ envelope protein\ glycoprotein\
                membrane protein\ nonstructural protein\
                polyprotein
FEATURE
   2-116        #Protein capsid protein C <CPC>\
   117-280      #Protein membrane protein M precursor <MPP>\
   117-205      #Domain signal sequence <SIG>\
   206-280      #Protein membrane protein M <MPM>\
   246-264      #Domain transmembrane <TM1>\
   281-776      #Protein envelope protein E <EPE>\
   738-751      #Domain transmembrane <TM2>\
   777-1128     #Protein nonstructural protein NS1 <NS1>\
   1129-1358    #Protein nonstructural protein NS2a <N2A>\
   1359-1489    #Protein nonstructural protein NS2b <N2B>\
   1490-2110    #Protein nonstructural protein NS3 <NS3>\
   2111-2259    #Protein nonstructural protein NS4a <N4A>\
   2260-2511    #Protein nonstructural protein NS4b <N4B>\
   2512-3414    #Protein nonstructural protein NS5 <NS5>\
   144,434,641,753,861,
   983,999,1649,1988,
   2044,2447,2529,2686,
   2726         #Binding-site carbohydrate (Asn)
                (covalent) (predicted)
SUMMARY         #Molecular-weight 378383 #Length 3414 #Checksum 1083

SEQUENCE 

COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:21:17-PST using FastDB

Initial Score    =      6   Optimized Score            =    6   Significance = 3.09
Residue Identity =   100%   Matches                    =    6   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0



RRVTTASAAQRRGRVGRQDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFYGPEQDKMPEV 
1940 1950 1960 1970 1980 1990 2000 2010 

                            X    X
                            WHVAAN
                            ||||||
AGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTDRSWTWEGPDRDAVDEASGDLVTFRSPNGAERT
2020 2030 2040 X 2050 2060 2070 2080 

LRPVWKDARMFKEGRDIKEFVAYASGRRSFGDVLTGMSGVPELLRHRCVSALDVFYTLMHEE 
2090 2100 2110 2120 2130 2140 

2. CELSA-1 (1-6)
   POLG_TBEVW   GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CO

ID   POLG_TBEVW     STANDARD;      PRT;  3414 AA.
AC   P14336;
DT   01-JAN-1990 (REL. 13, CREATED)
DT   01-NOV-1991 (REL. 20, LAST SEQUENCE UPDATE)
DE   GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX
DE   PROTEIN (ENVELOPE PROTEIN M); MAJOR ENVELOPE PROTEIN E; NONSTRUCTURAL
DE   PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3); RNA-DIRECTED
DE   RNA POLYMERASE (EC 2.7.7.48) (NS5)).
OS   TICK-BORNE ENCEPHALITIS VIRUS (WESTERN SUBTYPE) (TBEV).
OC   VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; FLAVIVIRIDAE.
RN   [1]
RP   SEQUENCE OF 1-779 FROM N.A.
RC   STRAIN=NEUDOERFL;
RM   33322370
RA   MANDL C.W., HEINZ F.X., KUNZ C.;
RL   VIROLOGY 166:197-205(1988).
RN   [2]
RP   SEQUENCE OF 767-3414 FROM N.A.
RC   STRAIN=NEUDOERFL;
RM   90051080
RA   MANDL C.W., HEINZ F.X., STOECKL E., KUNZ C.;
RL   VIROLOGY 173:291-301(1989).
CC   -!- FUNCTION: THE SMALL PROTEINS NS2A, NS2B, NS4A AND NS4B ARE
CC       HYDROPHOBIC, SUGGESTING A POSSIBLE MEMBRANE-RELATED FUNCTION.
CC       NS3 AND NS5 MAY PLAY A ROLE IN THE VIRAL RNA REPLICATION.
CC   -!- SUBUNIT: THE VIRION OF THIS VIRUS IS A NUCLEOCAPSID COVERED BY A
CC       LIPOPROTEIN ENVELOPE. THE ENVELOPE CONSISTS OF TWO PROTEINS:
CC       PROTEIN M AND GLYCOPROTEIN E. THE NUCLEOCAPSID IS A COMPLEX OF
CC       PROTEIN C AND MRNA.
DR   EMBL; M21498; TOGTBESP.
DR   EMBL; M33668; TBEGNE.
DR   PIR; A31052; GNWVNE.
KW   POLYPROTEIN; GLYCOPROTEIN; RNA-DIRECTED RNA POLYMERASE; CORE PROTEIN;
KW   COAT PROTEIN; ENVELOPE PROTEIN; HELICASE; ATP-BINDING; TRANSMEMBRANE;
KW   NONSTRUCTURAL PROTEIN.

FT   INIT_MET       1      1       REMOVED FROM CAPSID PROTEIN C BY THE
FT                                 CELLULAR AMINOPEPTIDASE.
FT   CHAIN          2    112       CAPSID PROTEIN C.
FT   PROPEP       113    205
FT   CHAIN        206    280       ENVELOPE GLYCOPROTEIN M.
FT   CHAIN        281    776       MAJOR ENVELOPE PROTEIN E.
FT   CHAIN        777   1128       NONSTRUCTURAL PROTEIN NS1.
FT   CHAIN       1129   1358       NONSTRUCTURAL PROTEIN NS2A.
FT   CHAIN       1359   1489       NONSTRUCTURAL PROTEIN NS2B.
FT   CHAIN       1490   2110       HELICASE (NS3).
FT   CHAIN       2111   2259       NONSTRUCTURAL PROTEIN NS4A.
FT   CHAIN       2260   2511       NONSTRUCTURAL PROTEIN NS4B.
FT   CHAIN       2512   3414       RNA-DIRECTED RNA POLYMERASE (NS5).
FT   TRANSMEM     101    112       HYDROPHOBIC SIGNAL SEQUENCE (POTENTIAL).
FT   TRANSMEM     247    259       POTENTIAL.
FT   TRANSMEM     266    280       POTENTIAL.
FT   TRANSMEM     738    751       POTENTIAL.
FT   CARBOHYD     144    144       POTENTIAL.
FT   CARBOHYD     434    434       POTENTIAL.
FT   CARBOHYD     861    861       POTENTIAL.
FT   CARBOHYD     983    983       POTENTIAL.
FT   CARBOHYD     999    999       POTENTIAL.
FT   CARBOHYD    2447   2447       POTENTIAL.

SQ   SEQUENCE  3414 AA;  378379 MW;  2.052432E+07 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:21:16-PST using FastDB

Initial Score    =      6   Optimized Score            =    6   Significance = 3.09
Residue Identity =   100%   Matches                    =    6   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0



RRVTTASAAQRRGRVGRQDGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFYGPEQDKMPEV 
1940 1950 1960 1970 1980 1990 2000 2010

                            X    X
                            WHVAAN
                            ||||||
AGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTDRSWTWEGPDRDAVDEASGDLVTFRSPNGAERT
2020 2030 2040 X 2050 2060 2070 2080 

LRPVWKDARMFKEGRDIKEFVAYASGRRSFGDVLTGMSGVPELLRHRCVSALDVFYTLMHEE 
2090 2100 2110 2120 2130 2140 


3. CELSA-1 (1-6)
   GNWVTB       Genome polyprotein - Tick-borne encephalitis virus

ENTRY           GNWVTB #Type Protein
TITLE           Genome polyprotein - Tick-borne encephalitis virus
                (strain Sofjin)
INCLUDES        capsid protein C\ envelope protein prM\ envelope
                protein M\ major envelope protein E\ nonstructural
                protein NS1\ nonstructural protein NS2a\
                nonstructural protein NS2b\ nonstructural protein
                NS3\ nonstructural protein NS4a\ nonstructural
                protein NS4b\ nonstructural protein NS5
DATE            31-Mar-1991 #Sequence 31-Mar-1991 #Text 30-Jun-1992
PLACEMENT       2169.0    4.0    1.0    [illegible]    1.0
SOURCE          tick-borne encephalitis virus
ACCESSION       A33776\ S06414
REFERENCE
   #Authors     Pletnev A.G., Yamshchikov V.F., Blinov V.M.
   #Journal     Virology (1990) 174:250-263
   #Title       Nucleotide sequence of the genome and complete amino
                acid sequence of the polyprotein of tick-borne
                encephalitis virus.
   #Reference-number A33776
   #Accession   A33776
   #Molecule-type genomic RNA
   #Residues    1-3412 <PLE>
   #Cross-reference GB:X07755
COMMENT         This virus is a member of the family Flaviviridae.
SUPERFAMILY     #Name yellow fever virus genome polyprotein
KEYWORDS        capsid protein\ envelope protein\ glycoprotein\
                nonstructural protein\ polyprotein
FEATURE
   2-112        #Protein capsid protein C <CPC>\
   113-205      #Protein envelope protein prM <PRM>\
   206-280      #Protein envelope protein M <PMM>\
   281-776      #Protein major envelope protein E <PPE>\
   777-1190     #Protein nonstructural protein NS1 <NS1>\
   1191-1358    #Protein nonstructural protein NS2a <N2A>\
   1359-1489    #Protein nonstructural protein NS2b <N2B>\
   1490-2110    #Protein nonstructural protein NS3 <NS3>\
   2111-2259    #Protein nonstructural protein NS4a <N4A>\
   2260-2510    #Protein nonstructural protein NS4b <N4B>\
   2511-3412    #Protein nonstructural protein NS5 <NS5>\
   144,434,641,753,861,
   983,999,1228,1649,
   1988,2044,2052,2447,
   2466,2685,2725 #Binding-site carbohydrate (Asn)
                (covalent) (predicted)
SUMMARY         #Molecular-weight 377979 #Length 3412 #Checksum 7007
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:21:17-PST using FastDB

Initial Score    =      6   Optimized Score            =    6   Significance = 3.09
Residue Identity =   100%   Matches                    =    6   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0


RRVTTASAAQRRGRVGRQEGRTDEYIYSGGCDDDDSGLVQWKEAQILLDNITTLRGPVATFYGPEQDKMPEV 
1940 1950 1960 1970 1980 1990 2000 2010

                            X    X
                            WHVAAN
                            ||||||
AGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTSRNWTWEGFEENTVDEANGDLVTFRSPNGAERT
2020 2030 2040 X 2050 2060 2070 2080

LRPVWRDARMFREGRDIREFVAYASGRRSFGDVLSGMSGVPELLRHRCVSAMDVFYTLMHEE
2090 2100 2110 2120 2130 2140 


4. CELSA-1 (1-6)
   POLG_TBEVS   GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CO

ID   POLG_TBEVS     STANDARD;      PRT;  3412 AA.
AC   P07720; P07721;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-MAY-1991 (REL. 18, LAST SEQUENCE UPDATE)
DT   01-AUG-1992 (REL. 23, LAST ANNOTATION UPDATE)
DE   GENOME POLYPROTEIN (CONTAINS: CAPSID PROTEIN C (CORE PROTEIN); MATRIX
DE   PROTEIN (ENVELOPE PROTEIN M); MAJOR ENVELOPE PROTEIN E; NONSTRUCTURAL
DE   PROTEINS NS1, NS2A, NS2B, NS4A AND NS4B; HELICASE (NS3); RNA-DIRECTED
DE   RNA POLYMERASE (EC 2.7.7.48) (NS5)).
OS   TICK-BORNE ENCEPHALITIS VIRUS (STRAIN SOFJIN) (TBEV).
OC   VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; FLAVIVIRIDAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   90101331
RA   PLETNEV A.G., YAMSHCHIKOV V.F., BLINOV V.M.;
RL   VIROLOGY 174:250-263(1990).
RN   [2]
RP   SEQUENCE OF 1-1190 FROM N.A.
RM   38319933
RA   YAMSHCHIKOV V.F., PLETNEV A.G.;
RL   NUCLEIC ACIDS RES. 16:7750-7750(1988).
RN   [3]
RP   SEQUENCE OF 1-683 AND 753-1002 FROM N.A.
RM   36220766
RA   PLETNEV A.G., YAMSHCHIKOV V.F., BLINOV V.M.;
RL   FEBS LETT. 200:317-321(1986).
CC   -!- FUNCTION: THE SMALL PROTEINS NS2A, NS2B, NS4A AND NS4B ARE
CC       HYDROPHOBIC, SUGGESTING A POSSIBLE MEMBRANE-RELATED FUNCTION.
CC       NS3 AND NS5 MAY PLAY A ROLE IN THE VIRAL RNA REPLICATION.
CC   -!- SUBUNIT: THE VIRION OF THIS VIRUS IS A NUCLEOCAPSID COVERED BY A
CC       LIPOPROTEIN ENVELOPE. THE ENVELOPE CONSISTS OF TWO PROTEINS:
CC       PROTEIN M AND GLYCOPROTEIN E. THE NUCLEOCAPSID IS A COMPLEX OF
CC       PROTEIN C AND MRNA.
CC   -!- THE NONSTRUCTURAL PROTEINS NS1 PRESENTS TWO ALTERNATIVE CLEAVAGE
CC       SITES FOR ITS C-TERMINUS, WHICH MAY DEFINE A SOLUBLE OR A
CC       MEMBRANE-BOUND FORM OF NS1.
DR   EMBL; X07755; TBEV1.
DR   EMBL; X03870; TOTBEV1.
DR   EMBL; X03871; TOTBEV2.
DR   PIR; A33776; GNWVTB.
DR   PIR; A24055; GNWVTE.
DR   PIR; B24055; MNWVTE.
KW   POLYPROTEIN; GLYCOPROTEIN; RNA-DIRECTED RNA POLYMERASE; CORE PROTEIN;
KW   COAT PROTEIN; ENVELOPE PROTEIN; HELICASE; ATP-BINDING; TRANSMEMBRANE;
KW   NONSTRUCTURAL PROTEIN.

FT   INIT_MET       1      1       REMOVED FROM CAPSID PROTEIN C BY THE
FT                                 CELLULAR AMINOPEPTIDASE.
FT   CHAIN          2    112       CAPSID PROTEIN C.
FT   PROPEP       113    205
FT   CHAIN        206    280       ENVELOPE GLYCOPROTEIN M.
FT   CHAIN        281    776       MAJOR ENVELOPE PROTEIN E.
FT   CHAIN        777  ?1128       NONSTRUCTURAL PROTEIN NS1 (OR 1190).
FT   CHAIN      ?1129   1358       NONSTRUCTURAL PROTEIN NS2A (OR 1191).
FT   CHAIN       1359   1489       NONSTRUCTURAL PROTEIN NS2B.
FT   CHAIN       1490   2110       HELICASE (NS3).
FT   CHAIN       2111   2259       NONSTRUCTURAL PROTEIN NS4A.
FT   CHAIN       2260   2510       NONSTRUCTURAL PROTEIN NS4B.
FT   CHAIN       2511   3412       RNA-DIRECTED RNA POLYMERASE (NS5).
FT   TRANSMEM     101    112       HYDROPHOBIC SIGNAL SEQUENCE (POTENTIAL).
FT   TRANSMEM     247    259       POTENTIAL.
FT   TRANSMEM     266    280       POTENTIAL.
FT   TRANSMEM     738    751       POTENTIAL.
FT   CARBOHYD     144    144       POTENTIAL.
FT   CARBOHYD     434    434       POTENTIAL.
FT   CARBOHYD     861    861       POTENTIAL.
FT   CARBOHYD     983    983       POTENTIAL.
FT   CARBOHYD     999    999       POTENTIAL.
FT   CARBOHYD    1228   1228       POTENTIAL.
FT   CARBOHYD    2447   2447       POTENTIAL.
FT   CARBOHYD    2466   2466       POTENTIAL.
FT   CONFLICT     381    381       W -> S (IN REF. 3).
FT   CONFLICT     850    850       E -> D (IN REF. 3).
SQ   SEQUENCE  3412 AA;  377976 MW;  2.06S228E+07 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:21:17-PST using FastDB

Initial Score    =      6   Optimized Score            =    6   Significance = 3.09
Residue Identity =   100%   Matches                    =    6   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

RRVTTASAAQRRGRVGRQEGRTDEYIYSGQCDDDDSGLVQWKEAQILLDNITTLRGPVATFYGPEQDKMPEV
1940 1950 1960 1970 1980 1990 2000 2010

                            X    X
                            WHVAAN
                            ||||||
AGHFRLTEEKRKHFRHLLTHCDFTPWLAWHVAANVSSVTSRNWTWEGPEENTVDEANGDLVTFRSPNGAERT
2020 2030 2040 X 2050 2060 2070 2080 

LRPVWRDARMFREGRDIREFVAYASGRRSFGDVLSGMSGVPELLRHRCVSAMDVFYTLMHEE 
2090 2100 2110 2120 2130 2140 





IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-13-align.res made by alexk on Thu 25 Feb 93 10:47:52

Query sequence being compared: CELSA-13 (1-5)
Number of sequences searched:  50
Number of scores above cutoff: 50

Results of the initial comparison of CELSA-13 (1-5) with:
File : celsa-13.pep


[Histogram: NUMBER OF SEQUENCES (y-axis, 0 to 100) against initial SCORE (x-axis, 0 to 5), with the corresponding STDEV scale beneath the score axis]


PARAMETERS

Similarity matrix      Unitary     K-tuple              2
Mismatch penalty       5           Joining penalty     20
Gap penalty            1.00        Window size          5
Gap size penalty       0.26
Cutoff score           0
Randomization group    0




SEARCH STATISTICS

Scores:      Mean           Median          Standard Deviation
             4              5               0.42

Times:       CPU            Total Elapsed
             [illegible]    [illegible]

Number of residues:              15270
Number of sequences searched:    50
Number of scores above cutoff:   50



7-PST . 




<j i yi i i i j. u ai ilc 


x s> 


Ld llu idutfu udb^u <-* r i x r i x o x d l bujfd . 


11 100% similar sequences to the query sequence were found:


                                                         Init.  Opt.
    Sequence Name   Description                   Length Score Score   Sig. Frame

 1. A37113          #Ryanodine receptor, cardiac    4969     5     5   2.39     0
 2. BVFFSL          sol protein, large splice for   1597     5     5   2.39     0
 3. SOL_DROME       SMALL OPTIC LOBES PROTEIN.      1597     5     5   2.39     0
 4. A31854          Cytochrome P450 51 lanosterol    528     5     5   2.39     0
 5. CP51_CANTR      CYTOCHROME P450 LI (P450-L1A1    528     5     5   2.39     0
 6. STP1_ARATH      GLUCOSE TRANSPORTER (SUGAR CA    522     5     5   2.39     0
 7. S14627          #Glucose transport protein -     522     5     5   2.39     0
 8. S12042          #Sugar transport protein STP1    522     5     5   2.39     0
 9. B35901          #Calcium channel alpha-1 chai    164     5     5   2.39     0
10. JQ0700          Hypothetical 11K protein (mmo    103     5     5   2.39     0
11. YMMO_METCA      HYPOTHETICAL 11.9 KD PROTEIN     103     5     5   2.39     0


The list of other best scores is:

                                                         Init.  Opt.
    Sequence Name   Description                   Length Score Score   Sig. Frame

12. C24735          Glutathione transferase, 2-2       9     4     4   0.00     0
13. B24735          Glutathione transferase, 1-2      13     4     4   0.00     0
14. D24735          Glutathione transferase, 2-2      19     4     4   0.00     0
15. A24735          Glutathione transferase, 1-1      26     4     4   0.00     0
16. S21273          #Glutathione transferase chai     28     4     4   0.00     0
17. S09535          #Glutathione transferase - Ra     31     4     4   0.00     0
18. S03358          #Glutathione transferase - Ra     31     4     4   0.00     0
19. JQ0099          Hypothetical 7K protein - Pap     63     4     4   0.00     0
20. V07K_PMV        7 KD PROTEIN (ORF 4).             63     4     4   0.00     0



1. CELSA-13 (1-5)
   A37113       #Ryanodine receptor, cardiac muscle - Rabbit

ENTRY           A37113 #Type Protein
TITLE           #Ryanodine receptor, cardiac muscle - Rabbit
DATE            12-Feb-1991 #Sequence 12-Feb-1991 #Text 12-Feb-1991
PLACEMENT       0.0    0.0    0.0    0.0    0.0
COMMENT         #This entry is not verified.
SOURCE          Oryctolagus cuniculus #Common-name domestic rabbit
REFERENCE
   #Authors     Otsu K., Willard H.F., Khanna V.K., Zorzato F.,
                Green N.M., MacLennan D.H.
   #Journal     J. Biol. Chem. (1990) 265:13472-13483
   #Title       Molecular cloning of cDNA encoding the Ca(2+)
                release channel (ryanodine receptor) of rabbit
                cardiac muscle sarcoplasmic reticulum.
   #Reference-number A37113
   #Accession   A37113
SUMMARY         #Molecular-weight 565069 #Length 4969 #Checksum 5421
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:50-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

DLTSSDTFKEYDPDGKGIISKRDFHKAMESHKHYTQSETEFLLSCAETDENETLDYEEFVKRFHEPAKDIGF
4030 4040 4050 4060 4070 4080 4090

                            X   X
                            VLNYF
                            |||||
NVAVLLTNLSEHMPNDTRLGTFLELAESVLNYFGPFLGRIEIMGSAKRIERVYFEISESSRTQWEKPGVKES
4100 4110 4120 4130 4140 4150 4160

KRQFIFDVVNEGGEKEKMELFVNFCEDTIFEMQLAAQISESDLNERSANKEESEKERPEEQ
4170 4180 4190 4200 4210 4220 4230


2. CELSA-13 (1-5)
   BVFFSL       sol protein, large splice form - Fruit fly

ENTRY           BVFFSL #Type Protein
TITLE           sol protein, large splice form - Fruit fly
                (Drosophila melanogaster)
DATE            30-Jun-1992 #Sequence 30-Jun-1992 #Text 30-Jun-1992
PLACEMENT       1221.0    1.0    1.0    1.0    1.0
SOURCE          Drosophila melanogaster
ACCESSION       A41146
REFERENCE
   #Authors     Delaney S.J., Hayward D.C., Barleben F., Fischbach
                K.F., Miklos G.L.G.
   #Journal     Proc. Natl. Acad. Sci. U.S.A. (1991) 88:7214-7218
   #Title       Molecular cloning and analysis of small optic lobes,
                a structural brain gene of Drosophila
                melanogaster.
   #Reference-number A41146
   #Accession   A41146
   #Molecule-type mRNA
   #Residues    1-1597 <DEL>
   #Cross-reference GB:M64084
COMMENT         The sol (small optic lobes) mutation eliminates
                certain classes of columnar neurons.
COMMENT         An alternate splice form of 395 amino acids is
                observed, in which the first 393 are identical to
                the large sol protein.
GENETIC
   #Segment-number 19F4
   #Name        sol
SUPERFAMILY     #Name sol protein
FEATURE
   1017-1320    #Domain calpain catalytic domain homology <CAL>\
   10-29        #Region zinc finger-like motif <ZN1>\
   139-153      #Region zinc finger-like motif\
   647-667      #Region zinc finger-like motif\
   711-730      #Region zinc finger-like motif\
   752-771      #Region zinc finger-like motif\
   934-953      #Region zinc finger-like motif\
   1082         #Active-site Cys (predicted)\
   1243         #Active-site His (predicted)\
   1268         #Active-site Asn (predicted)
SUMMARY         #Molecular-weight 174713 #Length 1597 #Checksum 8253
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:[illegible]-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

AQLLSSRCVRFLMGASCGGGNMKVDEEEYGQKGLRPRHAYSVLDVKDIQGHRLLKLRNPWGHYSWRGDWSDD
1220 1230 1240 1250 1260 1270 1280

                            X   X
                            VLNYF
                            |||||
SSLWTDDLRDALMPHGASEGVFWISFEDVLNYFDCIDICKVRSGWNEVRLQGTLQPLCSISCVLLTVLEPTE
1290 1300 1310 X 1320 1330 1340 1350

AEFTLFQEGQRNSEKSQRSQLDLCVVIFRTRSPAAPEIGRLVEHSKRQVRGFVGCHKMLER
1360 1370 1380 1390 1400 1410


3. CELSA-13 (1-5)
   SOL_DROME    SMALL OPTIC LOBES PROTEIN.

ID   SOL_DROME      STANDARD;      PRT;  1597 AA.
AC   P27398;
DT   01-AUG-1992 (REL. 23, CREATED)
DT   01-AUG-1992 (REL. 23, LAST SEQUENCE UPDATE)
DT   01-AUG-1992 (REL. 23, LAST ANNOTATION UPDATE)
DE   SMALL OPTIC LOBES PROTEIN.
GN   SOL.
OS   DROSOPHILA MELANOGASTER (FRUIT FLY).
OC   eukaryota; metazoa; arthropoda; insecta; DIPTERA.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=BRAIN;
RM   91334436
RA   DELANEY S.J., HAYWARD D.C., BARLEBEN F., FISCHBACH K.F.,
RA   MIKLOS G.L.G.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 88:7214-7218(1991).
CC   -!- THE SOL (SMALL OPTIC LOBES) MUTATION ELIMINATES CERTAIN CLASSES OF
CC       COLUMNAR NEURONS.
CC   -!- ALTERNATIVE SPLICING: AN ALTERNATE SPLICE FORM OF 395 AMINO ACIDS
CC       IS OBSERVED, IN WHICH THE FIRST 393 ARE IDENTICAL TO THE LARGE
CC       SOL PROTEIN.
CC   -!- SIMILARITY: TO CALPAINS EUKARYOTIC THIOL PROTEASES.
DR   EMBL; M64084; DMSOL.
DR   PIR; A41146; BVFFSL.
DR   FLYBASE; 03464; RELEASE 9206.
KW   ALTERNATIVE SPLICING; ZINC-FINGER; HYDROLASE; THIOL PROTEASE.
FT   ZN_FING       10     29       C4-TYPE.
FT   ZN_FING      139    153       C4-TYPE.
FT   ZN_FING      647    667       C4-TYPE.
FT   DOMAIN       673    689       GLN-RICH.
FT   DOMAIN       690    697       POLY-HIS.
FT   ZN_FING      711    730       C4-TYPE.
FT   ZN_FING      752    771       C4-TYPE.
FT   ZN_FING      934    953       C4-TYPE.
FT   DOMAIN      1017   1320       CALPAIN CATALYTIC DOMAIN.
FT   ACT_SITE    1032   1032       BY SIMILARITY.
FT   ACT_SITE    1243   1243       BY SIMILARITY.
FT   ACT_SITE    1268   1268       BY SIMILARITY.
SQ   SEQUENCE  1597 AA;  174714 MW;  1.268343E+07 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

AQLLSSRCVRFLMGASCGGGNMKVDEEEYQQKGLRPRHAYSVLDVKDIQGHRLLKLRNPWGHYSWRGDWSDD
1220 1230 1240 1250 1260 1270 1280

                            X   X
                            VLNYF
                            |||||
SSLWTDDLRDALMPHGASEGVFWISFEDVLNYFDCIDICKVRSGWNEVRLQGTLQPLCSISCVLLTVLEPTE
1300 1310 X 1320 1330 1340 1350


AEFTLFQEGQRNSEKSQRSQLDL C V VI F R T R S P A A P EI GR L. VE HS KR 3 VR Q F V QC HKM l 






4. CELSA-13 (1-5)
   A31854       Cytochrome P450 51 lanosterol 14alpha-demethylase

ENTRY           A31854 #Type Protein
TITLE           Cytochrome P450 51 lanosterol 14alpha-demethylase -
                Imperfect fungus (Candida tropicalis)
ALTERNATE-NAME  cytochrome P450 14DM
DATE            07-Jun-1990 #Sequence 07-Jun-1990 #Text 10-Aug-1992
PLACEMENT       0.0    0.0    0.0    0.0    0.0
SOURCE          Candida tropicalis
ACCESSION       A31854
REFERENCE
   #Authors     Chen C., Kalb V.F., Turi T.G., Loper J.C.
   #Journal     DNA (1988) 7:617-626
   #Title       Primary structure of the cytochrome P450 lanosterol
                14alpha-demethylase gene from Candida tropicalis.
   #Reference-number A31854
   #Accession   A31854
   #Molecule-type DNA
   #Residues    1-528 <CHE>
SUPERFAMILY     #Name cytochrome P450
KEYWORDS        heme\ monooxygenase\ oxidoreductase
FEATURE
   470          #Binding-site heme iron (Cys) (axial
                ligand)
SUMMARY         #Molecular-weight 60927 #Length 528 #Checksum 4431
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

MQPYEFFEKCRLKYGDVFSFMLLGKVMTVYLGPKGHEFIYNAKLSDVSAEEAYTHLTTPVFGKGVIYDCPNS
70 80 90 100 110 120 130

                            X   X
                            VLNYF
                            |||||

RLMEQKKFAKFALTTDSFKTYVPKIREEVLNYFVNDVSFKTKERDHGVASVMKTQPEITIFTASRCLFGDEM 
140 150 160 X 170 180 190 200 

RKSFDRSFAQLYADLDKGFTPINFVFPNLPLPHYWRRDAAQRKISAHYMKEIKRRRESGDI 
210 220 230 240 250 260 270 


5. CELSA-13 (1-5)
   CP51_CANTR   CYTOCHROME P450 LI (P450-L1A1) (LANOSTEROL 14-ALPH

ID   CP51_CANTR     STANDARD;      PRT;   528 AA.
AC   P14263;
DT   01-JAN-1990 (REL. 13, CREATED)
DT   01-NOV-1990 (REL. 16, LAST SEQUENCE UPDATE)
DT   01-NOV-1990 (REL. 16, LAST ANNOTATION UPDATE)
DE   CYTOCHROME P450 LI (P450-L1A1) (LANOSTEROL 14-ALPHA DEMETHYLASE)
DE   (EC 1.14.14.1).
GN   CYP51 OR 14DM.
OS   CANDIDA TROPICALIS (YEAST).
OC   eukaryota; fungi; deuteromycotina (imperfect fungi).
RN   [1]
RP   SEQUENCE FROM N.A.
RM   89152749
RA   CHEN C., KALB V.F., TURI T.G., LOPER J.C.;
RL   DNA 7:617-626(1988).
RN   [2]
RP   SEQUENCE OF 434-523 FROM N.A.
RM   37293576
RA   CHEN C., TURI T.G., SANGLARD D., LOPER J.C.;
RL   BIOCHEM. BIOPHYS. RES. COMMUN. 146:1311-1317(1987).
CC   -!- FUNCTION: CYTOCHROMES P450 ARE A GROUP OF HEME-THIOLATE
CC       MONOOXYGENASES. THEY OXIDIZE A VARIETY OF STRUCTURALLY UNRELATED
CC       COMPOUNDS, INCLUDING STEROIDS, FATTY ACIDS, AND XENOBIOTICS.
CC   -!- CATALYTIC ACTIVITY: 14-ALPHA-DEMETHYLATION OF LANOSTEROL.
DR   EMBL; M23673; M23673.
DR   EMBL; M17595; CT14DM.
DR   PIR; A31854; A31854.
DR   PIR; A26828; A26328.
DR   PROSITE; PS00086; CYTOCHROME_P450.
KW   ELECTRON TRANSPORT; OXIDOREDUCTASE; MONOOXYGENASE; MEMBRANE;
KW   HEME.
FT   BINDING      470    470       HEME.
SQ   SEQUENCE   528 AA;  60928 MW;  1513726 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

MQPYEFFEKCRLKYGDVFSFMLLGKVMTVYLGPKGHEFIYNAKLSDVSAEEAYTHLTTPVFGKGVIYDCPNS
70 80 90 100 110 120 130

                            X   X
                            VLNYF
                            |||||
RLMEQKKFAKFALTTDSFKTYVPKIREEVLNYFVNDVSFKTKERDHGVASVMKTQPEITIFTASRCLFGDEM
140 150 160 X 170 180 190 200

RKSFDRSFAQLYADLDKGFTPINFVFPNLPLPHYWRRDAAQRKISAHYMKEIKRRRESGDI
210 220 230 240 250 260 270 


6. CELSA-13 (1-5)
   STP1_ARATH   GLUCOSE TRANSPORTER (SUGAR CARRIER).

ID   STP1_ARATH     STANDARD;      PRT;   522 AA.
AC   P23586;
DT   01-NOV-1991 (REL. 20, CREATED)
DT   01-NOV-1991 (REL. 20, LAST SEQUENCE UPDATE)
DT   01-MAY-1992 (REL. 22, LAST ANNOTATION UPDATE)
DE   GLUCOSE TRANSPORTER (SUGAR CARRIER).
GN   STP1.
OS   ARABIDOPSIS THALIANA (MOUSE-EAR CRESS).
OC   eukaryota; planta; embryophyta; angiospermae; DICOTYLEDONEAE;
OC   CAPPARALES; CRUCIFERAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=CV. LANDSBERG ERECTA;
RM   91005995
RA   SAUER N., FRIEDLAENDER K., GRAEML-WICKE U.;
RL   EMBO J. 9:3045-3050(1990).
CC   -!- FUNCTION: ACTIVE UPTAKE OF HEXOSES. PROBABLE GLUCOSE/HYDROGEN
CC       SYMPORT.
CC   -!- SIMILARITY: TO OTHER EUKARYOTIC AND PROKARYOTIC SUGAR SYMPORTERS.
DR   EMBL; X55350; ATSTP1.
DR   PIR; S12042; S12042.
DR   PIR; S14627; S14627.
DR   PROSITE; PS00216; SUGAR_TRANSPORT_1.
DR   PROSITE; PS00217; SUGAR_TRANSPORT_2.
KW   TRANSMEMBRANE; SUGAR TRANSPORT; SYMPORT; DUPLICATION.
FT   TRANSMEM  [illegible]         POTENTIAL.
FT   TRANSMEM     113    133       POTENTIAL.
FT   TRANSMEM     141    161       POTENTIAL.
FT   TRANSMEM     168    188       POTENTIAL.
FT   TRANSMEM     200    220       POTENTIAL.
FT   TRANSMEM     285    305       POTENTIAL.
FT   TRANSMEM     321    341       POTENTIAL.
FT   TRANSMEM     349    369       POTENTIAL.
FT   TRANSMEM     385    405       POTENTIAL.
FT   TRANSMEM     424    444       POTENTIAL.
FT   TRANSMEM     453    473       POTENTIAL.
SQ   SEQUENCE   522 AA;  57596 MW;  1520081 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

SSLYLAALISSLVASTVTRKFGRRLSMLFGGILFCAGALINGFAKHVWMLIVGRILLGFGIGFANQAVPLYL
90 100 110 120 130 140 150

                            X   X
                            VLNYF
                            |||||
SEMAPYKYRGALNIGFQLSITIGILVAEVLNYFFAKIKGGWGWRLSLGGAVVPALIITIGSLVLPDTPNSMI
160 170 180 190 200 210 220 230


ERGGHEEAKTKLRRIRGVDDVSQEFDDLVAASKESQSIEHPWRNLLRRKYRPHLTMAVMIP 

240 250 260 270 280 290 


7. CELSA-13 (1-5)
   S14627       #Glucose transport protein - Arabidopsis thaliana

ENTRY           S14627 #Type Protein
TITLE           #Glucose transport protein - Arabidopsis thaliana
DATE            28-Aug-1992 #Sequence 28-Aug-1992 #Text 28-Aug-1992
PLACEMENT       0.0    0.0    0.0    0.0    0.0
COMMENT         #This entry is not verified.
SOURCE          Arabidopsis thaliana #Common-name mouse-ear cress
REFERENCE
   #Authors     Sauer N., Friedl K., Wicke U.
   #Citation    submitted to the EMBL Data Library, October 1990
   #Title       Primary structure, genomic organization &
                heterologous expression of a glucose transporter
                from Arabidopsis thaliana.
   #Reference-number S14627
   #Accession   S14627
   #Cross-reference EMBL:X55350
SUMMARY         #Molecular-weight 57596 #Length 522 #Checksum 5679
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    =      5   Optimized Score            =    5   Significance = 2.39
Residue Identity =   100%   Matches                    =    5   Mismatches   =    0
Gaps             =      0   Conservative Substitutions =    0

SSLYLAALISSLVASTVTRKFGRRLSMLFGGILFCAGALINGFAKHVWMLIVGRILLGFGIGFANQAVPLYL
90 100 110 120 130 140 150

                            X   X
                            VLNYF
                            |||||
SEMAPYKYRGALNIGFQLSITIGILVAEVLNYFFAKIKGGWGWRLSLGGAVVPALIITIGSLVLPDTPNSMI
160 170 180 190 200 210 220 230




ERGQHEEAKTKLRRIRGVDDVSQEFDDLVAASKESQSIEHPWRNLLRRKYRPHLTMAVMIP
240 250 260 270 280 290


8. CELSA-13 (1-5)
S12042 *Sugar transport protein STP 1 - Arabidopsis thalia

ENTRY           S12042 #Type Protein
TITLE           *Sugar transport protein STP 1 - Arabidopsis thaliana
DATE            26-Sep-1992 #Sequence 26-Sep-1992 #Text 26-Sep-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         *This entry is not verified.
SOURCE          Arabidopsis thaliana #Common-name mouse-ear cress
REFERENCE
   #Authors     Sauer N., Friedlaender K., Graeml-Wicke U.
   #Journal     EMBO J. (1990) 9:3045-3050
   #Title       Primary structure, genomic organization and
                heterologous expression of a glucose transporter
                from Arabidopsis thaliana.
   #Reference-number S12042
   #Accession   S12042
   #Cross-reference EMBL:X55350
SUMMARY         #Molecular-weight 57596 #Length 522 #Checksum 5679
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:51-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.39
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             =      Conservative Substitutions

SSLYLAALISSLVASTVTRKFGRRLSMLFGGILFCAGALINGFAKHVWMLIVGRILLGFGIGFANQAVPLYL
90 100 110 120 130 140 150

                            X   X
                            VLNYF
                            |||||
SEMAPYKYRGALNIGFQLSITIGILVAEVLNYFFAKIKGGWGWRLSLGGAVVPALIITIGSLVLPDTPNSMI
160 170 180 190 200 210 220 230

ERGQHEEAKTKLRRIRGVDDVSQEFDDLVAASKESQSIEHPWRNLLRRKYRPHLTMAVMIP
240 250 260 270 280 290


9. CELSA-13 (1-5)
B35901 #Calcium channel alpha-1 chain, dihydropyridine

ENTRY           B35901 #Type Protein (fragment)
TITLE           #Calcium channel alpha-1 chain, dihydropyridine
                sensitive, homolog B, brain - Rat (fragment)
DATE            18-Apr-1991 #Sequence 18-Apr-1991 #Text 18-Apr-1991
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         #This entry is not verified.
SOURCE          Rattus norvegicus #Common-name Norway rat
REFERENCE
   #Authors     Snutch T.P., Leonard J.P., Gilbert M.M., Lester
                H.A., Davidson N.
   #Journal     Proc. Natl. Acad. Sci. U.S.A. (1990) 87:3391-3395
   #Title       Rat brain expresses a heterogeneous family of
                calcium channels.
   #Reference-number A35901
   #Accession   B35901
SUMMARY         #Length 164 #Checksum 2550
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:52-PST using FastDB

Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

             X   X
             VLNYF
             |||||
FSLECILKIIAFGVLNYFRDAWDVFDFVTVLGSITDILVTEIANNFINLSFLRLFRAARLIKLLRQGYTIRI
10 X 20 30 40 50 60 70

LLWTFVQSFKALPYVCLLIAMLFFIYAIIGMQVFGNIALDDGTSIN
80 90 100 110


10. CELSA-13 (1-5)
JQ0700 Hypothetical 11K protein (mmoC 5' region) -

ENTRY           JQ0700 #Type Protein
TITLE           Hypothetical 11K protein (mmoC 5' region) -
                Methylococcus capsulatus
ALTERNATE-NAME  hypothetical protein V
DATE            31-Mar-1992 #Sequence 31-Mar-1992 #Text 31-Mar
PLACEMENT       0.0 0.0 0.0 0.0 0.0
SOURCE          Methylococcus capsulatus
ACCESSION       JQ0700
REFERENCE
   #Authors     Stainthorpe A.C., Lees V., Salmond G.P.C., Dalton
                H., Murrell J.C.
   #Journal     Gene (1990) 91:27-34
   #Title       The methane monooxygenase gene cluster of
                Methylococcus capsulatus (Bath).
   #Reference-number JQ0700
   #Accession   JQ0700
   #Molecule-type DNA
   #Residues    1-103 <STA>
SUMMARY         #Molecular-weight 11942 #Length 103 #Checksum 3171
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:39:52-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.39
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

MVESAFQPFSGDADEWFEEPRPQAGFFFSADWHLLKRDETYAAYAKDLDFMWRWVIVREERIVQEGCSISLE
10 20 30 40 50 60 70

        X   X
        VLNYF
        |||||
SSIRAVTHVLNYFGMTEQRAPAEDRTGGVQH
80 X 90 100


11. CELSA-13 (1-5)
YMMO_METCA HYPOTHETICAL 11.9 KD PROTEIN IN MMOZ-MMOC INTERGEN

ID   YMMO_METCA standard; PRT; 103 AA.
AC   P22S67;
DT   01-AUG-1991 (REL. 19, CREATED)
DT   01-AUG-1991 (REL. 19, LAST SEQUENCE UPDATE)
DT   01-NOV-1991 (REL. 20, LAST ANNOTATION UPDATE)
DE   HYPOTHETICAL 11.9 KD PROTEIN IN MMOZ-MMOC INTERGENIC REGION (ORFY).
OS   METHYLOCOCCUS CAPSULATUS.
OC   prokaryota; GRACILICUTES; SCOTOBACTERIA; AEROBIC RODS AND COCCI;
OC   METHYLOCOCCACEAE.
RN   [1]
RM   90382694
RA   STAINTHORPE A.C., LEES V., SALMOND G.P.C., DALTON H., MURRELL J.C.;
RL   GENE 91:27-34(1990).
DR   EMBL; MSS498; MCMMOA.
DR   PIR; JQ0700; JQ0700.
KW   HYPOTHETICAL PROTEIN.
SQ   SEQUENCE 103 AA; 11942 MW; 52861 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:39:52-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.39
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

MVESAFQPFSGDADEWFEEPRPQAGFFPSADWHLLKRDETYAAYAKDLDFMWRWVIVREERIVQEGCSISLE
10 20 30 40 50 60 70

        X   X
        VLNYF
        |||||
SSIRAVTHVLNYFGMTEGRAPAEDRTGGVQH
80 X 90 100


IntelliGenetics


FastDB - Fast Pairwise Comparison of Sequences
Release 5.4 

Results file celsa-3-align.res made by alexk on Thu 25 Feb 93 10:43:03-PST.

Query sequence being compared: CELSA-3 (1-5) 

Number of sequences searched: 50 

Number of scores above cutoff: 50 

Results of the initial comparison of CELSA-3 (1-5) with: 

File : celsa-3.pep 


[Histogram: NUMBER OF SEQUENCES (vertical axis, 0 to 100) plotted against SCORE, with the corresponding STDEV scale beneath the score axis]




PARAMETERS

Similarity matrix      Unitary    K-tuple            2
Mismatch penalty       5          Joining penalty    20
Gap penalty            1.00       Window size        5
Gap size penalty       0.26
Cutoff score           0
Randomization group    0





SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           4       5         0.30

Times:     CPU               Total Elapsed
           [illegible]       [illegible]

Number of residues:              3896
Number of sequences searched:    50
Number of scores above cutoff:   50





Significance is calculated based on initial score.

5 100% similar sequences to the query sequence were found:


    Sequence Name  Description                    Length  Init.  Opt.   Sig.  Frame
                                                          Score  Score
 1. PS2507         Pullulanase protein.             1096      5      5  3.30
 2. A26879         alpha-Dextrin endo-1,6-alpha-    1096      5      5  3.30
 3. PULA_KLEAE     PULLULANASE (EC 3.2.1.41) (AL    1096      5      5  3.30
 4. S12888         *DNA-binding protein II - The      95      5      5  3.30
 5. DBH_THETH      DNA-BINDING PROTEIN II.            95      5      5  3.30

The list of other best scores is:

    Sequence Name  Description                    Length  Init.  Opt.   Sig.  Frame
                                                          Score  Score
 6. S10561         #Photosystem II 21K protein -                        0.00      0
 7. CCPS5S         Cytochrome c551 - Pseudomonas                        0.00      0
 8. C551_PSEST     CYTOCHROME C551.                                     0.00      0
 9. CCPB6          Cytochrome c6 - Plectonema bo                        0.00      0
10. CYC6_PLEBO     CYTOCHROME C6 (SOLUBLE CYTOCH                        0.00      0
11. CCA16          Cytochrome c6 - Anabaena vari                        0.00      0
12. CYC6_ANAVA     CYTOCHROME C6 (SOLUBLE CYTOCH                        0.00      0
13. DNBS2F         DNA-binding protein II - Baci                        0.00      0
14. DBH_BACST      DNA-BINDING PROTEIN II (HB) (                        0.00      0
15. S00015         DNA-binding protein HB - Baci                        0.00      0
16. B39409         *DNA-binding protein HB - Bac                        0.00      0
17. DBH_BACSU      DNA-BINDING PROTEIN II (HB) (                        0.00      0
18. R05446         CAT-GLP-1 hybrid protein.                            0.00      0
19. QQECRP         Hypothetical protein A-105 -                         0.00      0
20. S06985         Hypothetical protein (nifH1 3                        0.00      0


1. CELSA-3 (1-5)
PS2507 Pullulanase protein.

PS2507 standard; protein; 1096 AA.
01-NOV-1990 (first entry)
Pullulanase protein.
Pullulanase; starch; alcohol prodn.
Klebsiella aerogenes.
J63245676-A.
12-OCT-1988.
31-MAR-1987; 078355.
31-MAR-1987; JP-078355.

MDT ED i Der,ki Ka 9 aku Kogyo KKf (SUNR) Suntory Ltd. 

WPI; 38-333488/47. y 

N-PSDB; NS 1341. 

S™: d * r,v *° fr °" —->'«-*■» ^ 

Disclosure; p; Japanese. 

Protein cleaves alpha-1,6-glucoside bonds of starch and
[...] decomposition of branched starch. It is used in the
prodn. of maltose and glucose from starch, and of alcohol from starch
via glucose. Amino acid residues 1-19 can be deleted.
Sequence 1096 AA; 

mV V R: -? 3 D! 0 B! 10 Ci 51 a> 42 e, o 2, 89 ci 8a hi 

3 L dl Ml 31 F; 5a PI 96 SI 70 Tils Ml 41 y; SI V; 

Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 3.30
Gaps             = 0    Conservative Substitutions = 0


SFSDFTDRTVSVIAGNSAVYDSRADAFRAAFGVALADAHWVDKTTLLWPGGENKPIVRLYYSHSSKVAADSN
160 170 180 190 200 210 220 230

                            X   X
                            YPAFK
                            |||||

GEFSDKYVKLTPTTVNQQVSMRFPHLASYPAFKLPDDVNVDELLQGDDGGIAESDGILSLSHPGADRRRAGR 
240 250 260 X 270 280 290 300 

YLCRRAEALSYGAQLTDSGVTFRVWAPTAQQVELVIYSADKKVIASHPMTRDSASGAWSWQ 
310 320 330 340 350 360 


2. CELSA-3 (1-5) 

A26879 alpha-Dextrin endo-1,6-alpha-glucosidase precursor

ENTRY           A26879 #Type Protein
TITLE           alpha-Dextrin endo-1,6-alpha-glucosidase precursor -
                Klebsiella pneumoniae #EC-number 3.2.1.41
ALTERNATE-NAME  pullulanase
DATE            30-Jun-1988 #Sequence 30-Jun-1988 #Text 31-Sep-1992

PLACEMENT 0.0 0.0 0.0 0.0 0.0 

SOURCE Klebsiella pneumoniae 

ACCESSION A26879 

REFERENCE       (K. aerogenes, strain W70)
   #Authors     Katsuragi N., Takizawa N., Murooka Y.
   #Journal     J. Bacteriol. (1987) 169:2301-2306
   #Title       Entire nucleotide sequence of the pullulanase gene
                of Klebsiella aerogenes W70.
   #Reference-number A26879
   #Accession   A26879
   #Molecule-type DNA
   #Residues    1-1096 <KAT>
GENETIC
   #Name        pulA

KEYWORDS glycosidase\ hydrolase 

FEATURE 

1 - 19 #Domain signal sequence <SIG>\ 

   20-1096      #Protein alpha-dextrin endo-1,6-alpha-glucosidase <MAT>

SUMMARY         #Molecular-weight 119335 #Length 1096 #Checksum 1390

SEQUENCE 

COMMENT Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB 

Initial Score = 5 Optimized Score = 5 Significance = 3.30 

Residue Identity = 100% Matches = 5 Mismatches = 0

Gaps = 0 Conservative Substitutions = 0 

SFSDFTDRTVSVIAGNSAVYDSRADAFRAAFGVALADAHWVDKTTLLWPGGENKPIVRLYYSHSSKVAADSN 
160 170 180 190 200 210 220 230 

                            X   X
                            YPAFK
                            |||||
GEFSDKYVKLTPTTVNQQVSMRFPHLASYPAFKLPDDVNVDELLQGDDGGIAESDGILSLSHPGADRRRAGR
240 250 260 X 270 280 290 300

YLCRRAEALSYGAQLTDSGVTFRVWAPTAQQVELVIYSADKKVIASHPMTRDSASGAWSWQ
310 320 330 340 350 360 


3. CELSA-3 (1-5) 

PULA_KLEAE PULLULANASE (EC 3.2.1.41) (ALPHA-DEXTRIN ENDO-1,6





ID   PULA_KLEAE STANDARD; PRT; 1096 AA.
AC   P07811;
DT   01-AUG-1988 (REL. 08, CREATED)
DT   01-AUG-1988 (REL. 08, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   PULLULANASE (EC 3.2.1.41) (ALPHA-DEXTRIN ENDO-1,6-ALPHA-GLUCOSIDASE)

DE (PULLULAN 6-GLUCANOHYDROLASE). 

GN PULA. 

OS KLEBSIELLA AEROGENES. 

OC PROKARYOTA; GRACILICUTES; SCOTOBACTERIA; FACULTATIVELY ANAEROBIC RODS; 
OC ENTEROBACTERIACEAE. 

RN   [1]

RP SEQUENCE FROM N.A. 

RC STRAIN=W70; 

RM 87194626 

RA KATSURAGI N., TAKIZAWA N., MUROOKA Y.; 

RL J. BACTERIOL. 169:2301-2306(1987). 

CC   -!- CATALYTIC ACTIVITY: STARCH-DEBRANCHING ENZYME, HYDROLYZES
CC       (1-6)-ALPHA-GLUCOSIDIC LINKAGES IN PULLULAN AND STARCH TO

CC FORM MALTOTRIOSE. 

CC -!- SUBUNIT: HOMOTRIMER. 

DR EMBL; M161S7; KAPULA. 

DR PIR; A26879; A26879. 

DR PROSITE; PS00013; PROKAR_LIPOPROTEIN. 

KW HYDROLASE; GLYCOSIDASE; LIPOPROTEIN; SIGNAL. 

FT SIGNAL 1 19 

FT CHAIN 20 1096 PULLULANASE. 

FT LIPID 20 20 N-ACYL DIGLYCERIDE. 

SQ SEQUENCE 1096 AA; 119335 MW; 5852107 CN; 

CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 3.30
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

SFSDFTDRTVSVIAGNSAVYDSRADAFRAAFGVALADAHWVDKTTLLWPGGENKPIVRLYYSHSSKVAADSN
160 170 180 190 200 210 220 230

                            X   X
                            YPAFK
                            |||||
GEFSDKYVKLTPTTVNQQVSMRFPHLASYPAFKLPDDVNVDELLQGDDGGIAESDGILSLSHPGADRRRAGR
240 250 260 X 270 280 290 300

YLCRRAEALSYGAQLTDSGVTFRVWAPTAQQVELVIYSADKKVIASHPMTRDSASGAWSWQ 
310 320 330 340 350 360 


4. CELSA-3 (1-5)
S12888 *DNA-binding protein II - Thermus aquaticus

ENTRY           S12888 #Type Protein
TITLE           #DNA-binding protein II - Thermus aquaticus
DATE            11-Apr-1992 #Sequence 11-Apr-1992 #Text 11-Apr-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         #This entry is not verified.
SOURCE          Thermus aquaticus
REFERENCE
   #Authors     Zierer R., Choli D.
   #Journal     FEBS Lett. (1990) 273:59-62
   #Title       The primary structure of DNA binding protein II from
                the extreme thermophilic bacterium Thermus
                thermophilus.
   #Reference-number S12888
   #Accession   S12888
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 3.30
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

AAKKTVTKADLVDQVAQATGLKLLDVKAMVDALLAKVEEALANGSKVQLTGFGTFEVRKRKARTGVKPGTKE
10 20 30 40 50 60 70

        X   X
        YPAFK
        |||||
KIKIPATQYPAFKPGKALKDKVK
80 X 90


5. CELSA-3 (1-5)
DBH_THETH DNA-BINDING PROTEIN II

ID   DBH_THETH STANDARD; PRT; 95 AA.
AC   P19436;
DT   01-FEB-1991 (REL. 17, CREATED)
DT   01-FEB-1991 (REL. 17, LAST SEQUENCE UPDATE)
DT   01-MAY-1991 (REL. 18, LAST ANNOTATION UPDATE)
DE   DNA-BINDING PROTEIN II.
OS   THERMUS AQUATICUS (SUBSP. THERMOPHILUS).
OC   prokaryota; GRACILICUTES; SCOTOBACTERIA; AEROBIC RODS AND COCCI;
OC   UNCERTAIN.
RN   [1]
RP   SEQUENCE.
RM   91032203
RA   ZIERER R., CHOLI D.;
RL   FEBS LETT. 273:59-62(1990).
CC   -!- FUNCTION: THIS PROTEIN BELONGS TO THE HISTONE LIKE FAMILY OF
CC       PROKARYOTIC DNA-BINDING PROTEINS WHICH ARE CAPABLE OF WRAPPING
CC       DNA AND TO STABILIZE DNA FROM DENATURATION UNDER EXTREME
CC       ENVIRONMENTAL CONDITIONS.
CC   -!- SUBUNIT: HOMODIMER.
DR   PIR; S12888; S12888.
DR   PROSITE; PS00045; HISTONE_LIKE.
KW   DNA-BINDING; DNA CONDENSATION.
SQ   SEQUENCE 95 AA; 10163 MW; 48066 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 3.30
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

AAKKTVTKADLVDQVAQATGLKLLDVKAMVDALLAKVEEALANGSKVQLTGFGTFEVRKRKARTGVKPGTKE
10 20 30 40 50 60 70

        X   X
        YPAFK
        |||||
KIKIPATQYPAFKPGKALKDKVK
80 X 90



IntelliGenetics


FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-align.res made by alexk on Thu 25 Feb 93 10:44:21-PST.

Query sequence being compared: CELSA-1 (1-6)
Number of sequences searched: 890 (interrupted)
Number of scores above cutoff: 757

Results of the initial comparison of CELSA-1 (1-6) with:
File : *.pep


[Histogram: NUMBER OF SEQUENCES (vertical axis, 0 to 1000) plotted against SCORE, with the corresponding STDEV scale beneath the score axis]


PARAMETERS

Similarity matrix      Unitary    K-tuple
Mismatch penalty                  Joining penalty
Gap penalty            1.00       Window size
Gap size penalty       0.26
Cutoff score           0
Randomization group    0

SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           [illegible]

Times:     CPU               Total Elapsed
           00:00:05.04       00:00:07.00

Number of residues:              442959
Number of sequences searched:    890
Number of scores above cutoff:   757




IntelliGenetics


FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-4-align.res made by alexk on Thu 25 Feb 93 10:51:5S-PST.

Query sequence being compared: CELSA-4 (1-5)
Number of sequences searched: 50
Number of scores above cutoff: 50

Results of the initial comparison of CELSA-4 (1-5) with:
File : celsa-4.pep


[Histogram: NUMBER OF SEQUENCES (vertical axis, 0 to 100) plotted against SCORE, with the corresponding STDEV scale beneath the score axis]





PARAMETERS

Similarity matrix      Unitary    K-tuple            2
Mismatch penalty       5          Joining penalty    20
Gap penalty            1.00       Window size        5
Gap size penalty       0.26
Cutoff score           0
Randomization group    0




SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           4       5         0.48

Times:     CPU               Total Elapsed
           [illegible]       [illegible]

Number of residues:              S503
Number of sequences searched:    50
Number of scores above cutoff:   50





Significance is calculated based on initial score.

17 100% similar sequences to the query sequence were found:


    Sequence Name  Description                    Length  Init.  Opt.   Sig.  Frame
                                                          Score  Score
 1. S20590         *Sialidase - Actinomyces visc     913      5      5  2.09      0
 2. A33827         #Regulatory protein ral2 - Ye     611      5      5  2.09      0
 3. RAL2_SCHPO     RAL2 PROTEIN.                     611      5      5  2.09      0
 4. WMCVFM         Inclusion body matrix protein     512      5      5  2.09      0
 5. IBMP_FMVD      INCLUSION BODY MATRIX PROTEIN     512      5      5  2.09      0
 6. A30828         Steroid 17alpha-monooxygenase     507      5      5  2.09      0
 7. S16719         #Steroid 17alpha-monooxygenas     507      5      5  2.09      0
 8. CPT1_RAT       CYTOCHROME P450 XVIIA1 (P450-     507      5      5  2.09      0
 9. JH0594         Vasoactive intestinal polypep     459      5      5  2.09      0
10. S16562         #nolF protein - Rhizobium mel     367      5      5  2.09      0
11. NOLF_RHIME     NODULATION PROTEIN NOLF.          367      5      5  2.09      0
12. A27659         Cytochrome P450 17 - Rat (fra     237      5      5  2.09      0
13. A33980         Steroid 17alpha-monooxygenase     235      5      5  2.09      0
14. S18659         #Hypothetical protein - Mycop     148      5      5  2.09      0
15. R20793         CDR-grafted, humanised heavy      146      5      5  2.09      0
16. S06727         #Hypothetical protein 1 (mini     122      5      5  2.09      0
17. YM2_STRCO      MINI-CIRCLE HYPOTHETICAL 13.3     122      5      5  2.09      0

The list of other best scores is:

    Sequence Name  Description                    Length  Init.  Opt.   Sig.  Frame
                                                          Score  Score
18. A35776         #RecQ protein - Escherichia c       5      4      4  0.00      0
19. S04171         aadA protein - Klebsiella pne      23      4      4  0.00      0
20. RP30_YEAST     RIBOSOMAL PROTEIN RP30 (FRAGM      25      4      4  0.00      0


1. CELSA-4 (1-5)
S20590 *Sialidase - Actinomyces viscosus #EC-number

ENTRY           S20590 #Type Protein
TITLE           *Sialidase - Actinomyces viscosus #EC-number
                3.2.1.18
DATE            22-Jul-1992 #Sequence 22-Jul-1992 #Text 22-Jul-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         *This entry is not verified.
SOURCE          Actinomyces viscosus
REFERENCE
   #Authors     Henningsen M., Roggentin P., Schauer R.
   #Journal     Biol. Chem. Hoppe-Seyler (1991) 372:1065-1072
   #Title       Cloning, sequencing and expression of the sialidase
                gene from Actinomyces viscosus DSM 43798.
   #Reference-number S20590
   #Accession   S20590
   #Cross-reference EMBL:X62276
SUMMARY         #Molecular-weight 96216 #Length 913 #Checksum 4303
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

PMGTCSSPTTSARRTTATAAATTPNPNHIVQRRSTDGGKTWSAPTYIHQGTETGKKVGYSDPSYYVDHQTGT
330 340 350 360 370 380 390

                            X   X
                            QAEVS
                            |||||
IFNFHVKSYDQGWGGSRGGTDPENRGIIQAEVSTSTDNGWTWTHRTITADITKDKPWTARFAASGQGIQIQH
400 410 420 X 430 440 450 460

GPHAGRLVQQYTIRTAGGPVQAVSVYSDDHGKTWQAGTPIGTGMDENKVVELSDGSLMLNS
470 480 490 500 510 520


2. CELSA-4 (1-5)
A33827 #Regulatory protein ral2 - Yeast

ENTRY           A33827 #Type Protein
TITLE           #Regulatory protein ral2 - Yeast
                (Schizosaccharomyces pombe)
DATE            19-Sep-1992 #Sequence 19-Sep-1992 #Text 19-Sep-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         #This entry is not verified.
SOURCE          Schizosaccharomyces pombe
REFERENCE
   #Authors     Fukui Y., Miyake S., Satoh M., Yamamoto M.
   #Journal     Mol. Cell. Biol. (1989) 9:5617-5622
   #Title       Characterization of the Schizosaccharomyces pombe
                ral2 gene implicated in activation of the ras1
                gene product.
   #Reference-number A33827
   #Accession   A33827
   #Cross-reference GB:M30S27
SUMMARY         #Molecular-weight 69847 #Length 611 #Checksum 9734
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

VVFGNKTRKLTQDYNLRQSNYDHIVFIELEGYGVYRKPQMGRVTERSEQLGKLLLNGISDMEILTIERMHIP
380 390 400 410 420 430 440

                            X   X
                            QAEVS
                            |||||
CLSRMLYKRWPAFQKIMDRAVEKNQEAFQAEVSQLGPQLTDLPFSSIHSTGSRALYMPYSFETCSAFLHYIY
450 460 470 480 490 500 510 520

CGTLNGSYCTAKNLCNLLILCKGFEGLETFFAYIVHLLHGVLNRNNVKLIYETAALTGAKG
530 540 550 560 570 580


3. CELSA-4 (1-5)
RAL2_SCHPO RAL2 PROTEIN.

ID   RAL2_SCHPO STANDARD; PRT; 611 AA.
AC   P15258;
DT   01-APR-1990 (REL. 14, CREATED)
DT   01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   RAL2 PROTEIN.
GN   RAL2.
OS   SCHIZOSACCHAROMYCES POMBE (FISSION YEAST).
OC   EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   90066514
RA   FUKUI Y., MIYAKE S., SATOH M., YAMAMOTO M.;



i in ur rvMOi rr\u i cain,. 


r-uiM^iiuiM; inruLrtitu 
DR   EMBL; M30S27; SPRAL2.
DR   PIR; A33827; A33827.
SQ   SEQUENCE 611 AA; 69847 MW; 1910377 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

VVFGNKTRKLTQDYNLRQSNYDHIVFIELEGYGVYRKPQMGRVTERSEQLGKLLLNGISDMEILTIERMHIP
380 390 400 410 420 430 440

                            X   X
                            QAEVS
                            |||||
CLSRMLYKRWPAFQKIMDRAVEKNQEAFQAEVSQLGPQLTDLPFSSIHSTGSRALYMPYSFETCSAFLHYIY
450 460 470 480 490 500 510 520

CGTLNGSYCTAKNLCNLLILCKGFEGLETFFAYIVHLLHGVLNRNNVKLIYETAALTGAKG
530 540 550 560 570 580


4. CELSA-4 (1-5)
WMCVFM Inclusion body matrix protein - Figwort mosaic vir

ENTRY           WMCVFM #Type Protein
TITLE           Inclusion body matrix protein - Figwort mosaic virus
DATE            30-Sep-1991 #Sequence 30-Sep-1991 #Text 30-Jun-1992
PLACEMENT       2377.0 1.0 1.0 1.0 1.0
SOURCE          figwort mosaic virus
ACCESSION       S01284
REFERENCE
   #Authors     Richins R.D., Scholthof H.B., Shepherd R.J.
   #Journal     Nucleic Acids Res. (1987) 15:8451-8466
   #Title       Sequence of figwort mosaic virus DNA (caulimovirus
                group).
   #Reference-number S01279
   #Accession   S01284
   #Molecule-type DNA
   #Residues    1-512 <RIC>
   #Cross-reference EMBL:X06166
   #Comment     The translation of the nucleotide sequence is not
                given in this paper.
COMMENT         This virus is a member of the group Caulimovirus.
SUPERFAMILY     #Name caulimovirus inclusion body matrix protein
KEYWORDS        matrix protein
SUMMARY         #Molecular-weight 58207 #Length 512 #Checksum 1605
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

NLTPKSDKDKVKSSPVANGSGKDSTNPLNPVALGKSKMTILGQKQADEEEFKPDYLRAASNGQSWFAVYKGP
100 110 120 130 140 150 160

                            X   X
                            QAEVS
                            |||||
NKEFFTEWEIVADICKKRQKSKRFRSKEQAEVSISLYNKDIQDPVNFLRPVKLVKEERAAQPLKFKAIAAEG
170 180 190 X X 200 210 220 230

TIQFDEFRQIWEKSRLSDLEDGVQEKFYTNDSASKSTYTFVENAEPYLVHTAFRAGLAKVI
240 250 260 270 280 290


5. CELSA-4 (1-5)
IBMP_FMVD INCLUSION BODY MATRIX PROTEIN (VIROPLASMIN)

ID   IBMP_FMVD STANDARD; PRT; 512 AA.
AC   P09524;
DT   01-MAR-1989 (REL. 10, CREATED)
DT   01-MAR-1989 (REL. 10, LAST SEQUENCE UPDATE)
DT   01-AUG-1991 (REL. 19, LAST ANNOTATION UPDATE)
DE   INCLUSION BODY MATRIX PROTEIN (VIROPLASMIN).
GN   VI.
OS   FIGWORT MOSAIC VIRUS (STRAIN DXS) (FMV).
OC   VIRIDAE; DS-DNA NONENVELOPED VIRUSES; CAULIMOVIRIDAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   88040466
RA   RICHINS R.D., SCHOLTHOF H.B., SHEPHERD R.J.;
RL   NUCLEIC ACIDS RES. 15:8451-8466(1987).
CC   -!- FUNCTION: ENHANCES THE TRANSLATION OF DOWNSTREAM ORF'S ON
CC       POLYCISTRONIC MRNA'S DERIVED FROM FIGWORT MOSAIC VIRUS.
CC   -!- SUBCELLULAR LOCATION: CYTOPLASMIC INCLUSION BODIES.
CC       THE INCLUSION BODIES ARE THE SITE OF VIRAL DNA SYNTHESIS, VIRION
CC       ASSEMBLY AND ACCUMULATION IN THE INFECTED CELL.
CC   -!- SIMILARITY: HIGH, WITH OTHER CAULIMOVIRUS VIROPLASMIN.
DR   EMBL; X06166; CAFMVXX.
DR   PIR; S01284; WMCVFM.
KW   TRANS-ACTING FACTOR; TRANSLATION REGULATION.
SQ   SEQUENCE 512 AA; 58207 MW; 1378914 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

NLTPKSDKDKVKSSPVANGSGKDSTNPLNPVALGKSKMTILGQKQADEEEFKPDYLRAASNGQSWFAVYKGP
100 110 120 130 140 150 160

                            X   X
                            QAEVS
                            |||||
NKEFFTEWEIVADICKKRQKSKRFRSKEQAEVSISLYNKDIQDPVNFLRPVKLVKEERAAQPLKFKAIAAEG
170 180 190 X X 200 210 220 230

TIQFDEFRQIWEKSRLSDLEDGVQEKFYTNDSASKSTYTFVENAEPYLVHTAFRAGLAKVI
240 250 260 270 280 290


6. CELSA-4 (1-5)
A30828 Steroid 17alpha-monooxygenase cytochrome P450 17 -

ENTRY           A30828 #Type Protein
TITLE           Steroid 17alpha-monooxygenase cytochrome P450 17 -
                Rat #EC-number 1.14.99.9
DATE            19-May-1989 #Sequence 19-May-1989 #Text 10-Aug-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
SOURCE          Rattus norvegicus #Common-name Norway rat
ACCESSION       A30828\ A31359
REFERENCE
   #Authors     Dufau M.L.
   #Citation    submitted to GenBank, December 1988
   #Reference-number A94511
   #Accession   A30828
   #Molecule-type mRNA
   #Residues    1-507 <DUF>

   #Authors     Namiki M., Kitamura M., Buczko E., Dufau M.L.
   #Journal     Biochem. Biophys. Res. Commun. (1988) 157:705-712
   #Title       Rat testis P-450-17-alpha cDNA: the deduced amino
                acid sequence, expression and secondary structural
                configuration.
   #Reference-number A90154
   #Accession   A31359
   #Molecule-type mRNA
   #Residues    1-507 <NAM>
SUPERFAMILY     #Name cytochrome P450
KEYWORDS        endoplasmic reticulum\ heme\ membrane protein\
                monooxygenase\ oxidoreductase
FEATURE
   2-21         #Domain transmembrane <TM1>\
   169-186      #Domain transmembrane <TM2>\
   441          #Binding-site heme iron (Cys) (axial
                ligand) (predicted)
SUMMARY         #Molecular-weight 57250 #Length 507 #Checksum 9025
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

ENEWDQPDQFMPERFLDPTGSHLITPTQSYLPFGAGPRSCIGEALARQELFVFTALLLQRFDLDVSDDKQLP
410 420 430 440 450 460 470

                            X   X
                            QAEVS
                            |||||
RLEGDPKVVFLIDPFKVKITVRQAWMDAQAEVST
480 490 500 X X


7. CELSA-4 (1-5)
S16719 #Steroid 17alpha-monooxygenase - Rat #EC-number

ENTRY           S16719 #Type Protein
TITLE           #Steroid 17alpha-monooxygenase - Rat #EC-number
                1.14.99.9
DATE            15-Jun-1992 #Sequence 15-Jun-1992 #Text 15-Jun-1992
PLACEMENT       0.0 0.0 0.0 0.0 0.0
COMMENT         #This entry is not verified.
SOURCE          Rattus norvegicus #Common-name Norway rat
REFERENCE
   #Authors     Fevold H.R., Lorence M.C., McCarthy J.L., Trant
                J.M., Kagimoto M., Waterman M.R., Mason J.I.
   #Journal     Mol. Endocrinol. (1989) 3:968-975
   #Title       Rat P450(17-alpha) from testis: characterization of
                a full-length cDNA encoding a unique steroid
                hydroxylase capable of catalyzing both Delta(4)-
                and Delta(5)-steroid-17,20-lyase reactions.
   #Reference-number S16719
   #Accession   S16719
   #Cross-reference EMBL:M31681
SUMMARY         #Molecular-weight 57250 #Length 507 #Checksum 9025
SEQUENCE
COMMENT         Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB

Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0
Gaps             = 0    Conservative Substitutions = 0

ENEWDQPDQFMPERFLDPTGSHLITPTQSYLPFGAGPRSCIGEALARQELFVFTALLLQRFDLDVSDDKQLP
410 420 430 440 450 460 470

                            X   X
                            QAEVS
                            |||||
RLEGDPKVVFLIDPFKVKITVRQAWMDAQAEVST
480 490 500 X X


8. CELSA-4 (1-5)
CPT1_RAT CYTOCHROME P450 XVIIA1 (P450-C17) (EC 1.14.99.9) (

ID   CPT1_RAT STANDARD; PRT; 507 AA.
AC   P11715;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-AUG-1990 (REL. 15, LAST SEQUENCE UPDATE)
DT   01-MAR-1992 (REL. 21, LAST ANNOTATION UPDATE)
DE   CYTOCHROME P450 XVIIA1 (P450-C17) (EC 1.14.99.9) (STEROID 17-ALPHA-
DE   HYDROXYLASE/17,20 LYASE).
GN   CYP17.
OS   RATTUS NORVEGICUS (RAT).
OC   eukaryota; metazoa; chordata; vertebrata; tetrapoda; mammalia;
OC   eutheria; rodentia.
RN   [1]
RP   SEQUENCE from n.a.
RM   89295447
RA   FEVOLD H.R., LORENCE M.C., MCCARTHY J.L., TRANT J.M., KAGIMOTO M.,
RA   WATERMAN M.R., MASON J.I.;
RL   MOL. ENDOCRINOL. 3:968-975(1989).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   TISSUE=TESTIS;
RM   89076306
RA   NAMIKI M., KITAMURA M., BUCZKO E., DUFAU M.L.;
RL   BIOCHEM. BIOPHYS. RES. COMMUN. 157:705-712(1988).
RN   [3]
RP   SEQUENCE OF 271-507 FROM N.A.
RM   88280759
RA   NISHIHARA M., WINTERS C.A., BUZKO E., WATERMAN M.R., DUFAU M.L.;
RL   BIOCHEM. BIOPHYS. RES. COMMUN. 154:151-158(1988).
RN   [4]
RP   SEQUENCE OF 273-507 FROM N.A.
RM   90046678
RA   MELLON S.H., VAISSE C.;
RL   PROC. NATL. ACAD. SCI. U.S.A. 86:7775-7779(1989).
CC   -!- FUNCTION: CYTOCHROMES P450 ARE A GROUP OF HEME-THIOLATE
CC       MONOOXYGENASES. THEY OXIDIZE A VARIETY OF STRUCTURALLY UNRELATED
CC       COMPOUNDS, INCLUDING STEROIDS, FATTY ACIDS, AND XENOBIOTICS.
CC   -!- CATALYTIC ACTIVITY: A STEROID + AH(2) + O(2) = A 17-ALPHA-
CC       HYDROXYSTEROID + A + H(2)O.
DR   EMBL; M31681; RNP45017.
DR   EMBL; M22204; RNP450X.
DR   EMBL; M21208; RRCYPC17.
DR   EMBL; M27282; RNP450C1.
DR   PIR; A27659; A27659.
DR   PIR; A30828; A30828.
DR   PIR; S16719; S16719.
DR   PROSITE; PS00086; CYTOCHROME_P450.
KW   ELECTRON TRANSPORT; OXIDOREDUCTASE; MONOOXYGENASE; MEMBRANE;
KW   HEME; STEROIDOGENESIS.
FT   BINDING 441 441 HEME.
SQ   SEQUENCE 507 AA; 57250 MW; 1347075 CN;
CC   -!- Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB


Initial Score    = 5    Optimized Score = 5    Significance = 2.09
Residue Identity = 100% Matches = 5            Mismatches = 0

ENEWDQPDQFMPERFLDPTGSHLITPTQSYLPFGAGPRSCIGEALARQELFVFTALLLQRFDLDVSDDKQLP
410 420 430 440 450 460 470

                            X   X
                            QAEVS
                            |||||
RLEGDPKVVFLIDPFKVKITVRQAWMDAQAEVST
480 490 500 X X

9. CELSA-4 (1-5)
JH0594 Vasoactive intestinal polypeptide receptor precur

JH0594 #Type Protein
Vasoactive intestinal polypeptide receptor precursor
17-Jul-1992 #Sequence 17-Jul-1992 #Text 17-Jul-1992
0.0 0.0 0.0 0.0 0.0
Rattus norvegicus #Common-name Norway rat
JH0594
(Lung)
Ishihara T., Shigemoto R., Mori K., Takahashi K., Nagata S.
Neuron (1992) 8:811-819
Functional expression and tissue distribution of a
novel receptor for vasoactive intestinal
polypeptide.
#Reference-number JH0594
#Accession JH0594
#Molecule-type mRNA
#Residues 1-459 <ISH>
#Cross-reference GB:M36375

FEATURE 


ENTRY 

TITLE 

date 

PLACEMENT 

SOURCE 

accession 

reference 

#Author 5 

# J o u r n a 1 
#Ti tle 


1 -30 
31-459 

53 , 69 , 100,292 

146-163 
176-195 
2 13-241 
256-277 
295-318 
344-363 
376-395 
SUMMARY 
SEQUENCE 

COMMENT R 

Initial Score =

Residue Identity =

0 Bp S „ 

Ci I H v VM--AFPPDNF 

3 h 0 


WD ‘^.fi n . si S r,sl sequence (predicted) 

1 . O I 1.3 > \ 


3 7 0 


ftProtei i- 

i va^uaci .1 v£■* i t,^* 5^1 r 

polype 

j pt i de j-ec •= 

- r> 10 r < r:i r a 0 

wdinaing-site carbohydrate ( 

X c o v a 1 

e nt,) (predicted) \ 

#Domain transmembrane <TM1> \
#Domain transmembrane <TM2> \
#Domain transmembrane <TM3> \
#Domain transmembrane <TM4> \
#Domain transmembrane <TM5> \
#Domain transmembrane <TM6> \
#Domain transmembrane <TM7>

■tght 520 

/ : ft L e rf iq t_ 

h 4 59 l?Ch 

y a 1 e >: k 

- 1 n I h u 2 5 

F e b 9 3 j, 0 : 

: Pi i f r ?i zed 

L3 C (.,< ' tc* 

b Si;: 

a t, c h e 5 

•= 

5 M, i : 

o n 3 e r v a L 

3. L'’ a* 1H r ] •; 

tut i cm 

VVG5FQGR■ 

/ V A I L Y C F L N: 

]EV 3AEL.RIF 

1 BO 

3^0 

4 0 0 


:ma’ 


„ o c* *r . , _ • _ 

! J * •*» 1 r.g t as t,D£ 

- aTlLH ~ 2 - 0 ? 

7 0 

:r 0 

• OGvl- •-•••.'tShSOlirW 


-i i n 


■ i t -i 4 


QAEVS

l j j * j 




• I 1 1 1 






ENTRY 

title 

date 

placement 

COMMENT 

source: 

reference 

# A u t h o r s 


S16562 #Type Protein 

*nolF protein - Rhizobium meliloti

r:r 9 T: s * q Tr u ;:r: 1 , :v T,xl 

*This entry is not verified.
Rhizobium meliloti


#Journal
#Title


Baev N., Endre G., Petrovics G., Banfalvi Z.,
Kondorosi A.

Mol. Gen. Genet. (1991) 228:113-124

nodulation genes of nod box locus 4 in Rhizobium

meliloti involved in nodulation signal

production: nodM codes for D-glucosamine

synthetase.

#Reference-number S16561
#Accession S16562
$ C r o s s - r* e f e r e n r & f N p; i -v^oiio 

SUMMARY „S1, ... 

=>EQUENCE . 1 vrL - - ,f, y - n 3C3? #Checksum 98 E 4 

COMMENT 

Retrieved by alexk on Thu 25 Feb 93 10:42:00-PST using FastDB


5 Significance
Mismatches


2.09
o 
o 


I n i t i a 1 Score = lt n . . . 

r* . , *—■•* vJ jL ! "t- 1 /Tj i Z d R 

Residue Identity = 100% Matches

Gaps =

0 Conservative Substitutions

L1 1 i RR S Tt. T Sft YSft R i_ L T S CSDR G R G Q C RRtJppppp vr;-er. - ~ ^. 

70 SO 90 io!5 ts ^- ArtSrfUGCLSAflTEL AEAVLERNTRL 

ilu i 80 130 

X X 
QAEVS 

G E R G A A 3 E A T R L A A L A D V L D L R A H U R 2 k 4 A E V P n a r" r q i A r ,._ , _ 

140 150 jao X --A-nSt-odMc/rtMtPGGVIRARSVEEGQTVPLNTQLMTI 

" idL> 1P0 200 

VELNRLEVDAG V PT5RIPLIR1 kqawfi MU'-e-r.-r- 

210 220 • • ^-E.,,..urFuR,r- a GEVARISPTADAGSRAVRVFIA 

' 3U ,: ' 40 250 260 

11. CELSA-4 (1-5) 

NOLF_RHIME NODULATION PROTEIN NOLF.


ID

NOLF_RHIME STA

AC

P25196;

DT

01-MAY-1992 (REL.

DT

01-MAY-1992 (REL.

DT

01-MAY-1992 (REL.

DE

NODULATION PROTEIN

GN

NOLF.

OS

RHIZOBIUM MELILOTI

OG

PLASMID SYM PRME41:

OC

PROKARYOTA; GRACIL

OC

RHIZOBIACEAE.

RN

[1]

RP

SEQUENCE FROM N.A.

i,_. 

STR A IN-AK 631?. 

RN 

’’ ! 360053 

RA

BAEV N., ENDRE G.,

RL

MOL. GEN. GENET. 22

C C 

- FUNCT f ON ; JMvOi 

*. \ - 

M-'OUl .'.Tinn sjrs 

Dm' 

- Mpl. , A oC >, 1 . ■ . 


PRT 


367 AA. 


- f-\ i 


ST 3 E Q U E N C E U F' DATE) 


• KOMDORDSI A.: 
UFLICAnO -SPEC l? U. 




L.* y 7! L 

Initial Score =

tr 

ij 

Residue Identity = 

100%

Gaps =

0


Optimized Scor


Significance
Mismatches


2.09


u-Tr- I *fiS1 u T5RVSRRLLTSCSDRGRG Q f:RRppppp R * q qr. Tr . r , A ..„..._, r _ 

70 SO 90 1 00*'" ^ Ct - SA ^ TELAEAVi - ERW ^^L 

120 130 

X X 
QAEVS

j j j | | 

G E <\ G A h 3 E A T R L A l A D V L D L R A H V R 3 K 3 A F V .3 p A E R ^ 1 p m a c- u c- * — r -• r ■ t - - 

140 150 'i* 0 v " ; 70 ''>'«tFuUv I rtAKbVEEGQTVPLNTOLMT I 

* 1/0 ISO 190 200 

VEL N RLEVDA«PTS K IPLIRLK 9 BVELHVEGFP G RTFS G EVAR I SP 1 -AD flG 5 R AVRVFlA 


d. H y 


250 


560 


12. CELSA-4 (1-5)

A27659 Cytochrome P450 17 - Rat (fragment) 

ENTRY A27659

TITLE #Type Protein (fragment)

Cytochrome P450 17 -

ALTERNATE-NAME c,t K hr OM P,^- 1 1-a , 

placement 31 To~ !98 n 31 - fl — 1 ■’*’ »T.»t io-au 9 -19, 2 

SOURCE R J: r . °:°. u '° 0-0 

ACCESSION A27 659 V ' 91CU! '' ^'-'Ornmon-name Norway rat 

reference 

#Authors 


#Journal
#Title


Nishihara M., Winters C.A., Buzko E., Waterman M.R.,

Dufau M.L.

Biochem. Biophys. Res. Commun. (1988) 154:151-158
Hormonal regulation of rat Leydig cell cytochrome

P-450-17-alpha mRNA levels and characterization of

a partial length rat P-450-17-alpha cDNA.

#Reference-number A27659

#Accession A27659
#Molecule-type mRNA
#Residues 1-237 <NIS>

Th !o aU ^? rS t ^ ar,Slatecl Lh - codc ' n GAT For re = i,Hu- 
L '’ 11 ■' ar,d FOS as Glu, A 5 n, and A«r. 

r e s p e c t ively , 

#Name cytochrome P450

heme \ monooxygenase \ oxidoreductase


#Comment


SUPERFAMILY

keywords

feature

171


#Binding-site heme

ligand)

#Length 23


iron (Cys) (axial

#Checksum 3451

Retrieved by alexk on Thu 25 Feh ^ j0 , 4o ,, , 

using FastDB


SUMMARY 
SEQUENCE 
COMMENT 

Initial Score =

~ *.J Q n T* 1 m ■> -T ^ -tJ Ci - - 

Residue Identity = 100% Matches

Gaps

Conservati


Significance
Mismatches


2.09

Substitutions

EMEND3PDSFNFERF! OPrcSiHLI rPT-iSVi PFRAPPPR.-.. . 

140 1 50 n i..-) . ‘ ' 'it. i.l.uR.- I ! LDV 3 £)r>i<QL 

J /,f . 1 -C 190 200 

X x 

QAEVS
! ! M i 


t !* > R P" r : t ! i f i- i 1 T*. r* (i - T - ^ • i . ! 

.. 1 .- 1 ^ A -f L - ✓ » t. i. / J i" t~ i .• ’■ J ; ■ 0 ( 7 . A, ,i TP, A -. . r 

■ * ■ ■ * c F ! ! : ij r J > J m i. / ■ i 


o o 





A33980

ENTRY 

TITLE 

DATE 

PLACEMENT 

SOURCE 

accession 

reference 

#Authors
#Journal
#Title


Steroid 17alpha-monooxygenase cytochrome P450 17 -
A33980 #Type Protein (fragment)

Steroid 17alpha-monooxygenase cytochrome P450 17 -
(fragment) #EC-number 1.14.99.9
31-Mar-1992 #Sequence 31-Mar-1992 #Text

0.0 0.0 0.0 0.0 10-Aug-1992

n d t b U 5 n T“' V Jb 4 J- I I - u _ 

A33980


Mellon S.H., Vaisse C.

camp f 4 ' 7775 ^ 7777 

cycloheximide-insensitive mechanism in cultured
mouse Leydig MA-10 cells.

#Reference-number A33980
#Accession A33980
#Molecule-type mRNA
#Residues 1-235 <MEL>

#Name cytochrome P450
heme \ monooxygenase \ oxidoreductase


SUPERFAMILY 

KEYWORDS 

FEATURE 

169 


SUMMARY 

SEQUENCE 

COMMENT 

Initial Score
Residue Identity
Gaps


#Binding-site heme iron
ligand) (predicted)


(Cys)


(axial


#Length 235 #Checksum 7248

Retrieved by alexk on Thy 25 Feh 97 I0:4^ 0 <-^T 

■“ ■ y k ' Ui 1 ^ using FastDB 


Optimized Scor


100% Matche


Conservative Substitutions


Significance
Mismatches


2.09
0
0


E N E W D U r‘ D Q F M p E R F L D P T G S H L I T p 7 & 3 v s p p p a r ^ n ^_ r , , 

130 140 150 ’ “wW' ' ^ ’" R ?^ Ut - M, - MKSeLFVFTALL l-QRFDLDVSDDKQLP 

,/U 180 190 200 


:< x 

QAEVS

! I I ! 


RLEGDPK WFL I DPFKVK I TVRQaWMDAQAE'YST 

k-10 220 2 7n 


14. CELSA-4 (1-5) 
S18659

ENTRY 
TITLE 
DATE 

PLACEMENT 
COMMENT 
SOURCE 
REFERENCE 
#Authors


#Journal

#Title


f’i y c.op 1 asri a hyorh i t■ j ( sGC 3 


*Hypothetical protein

S18659 #Type Protein

* H yP°thetical prote i r . - m u - r .' - - , r - • 

oa-s.p-jyM *s eq ,oa-ll:,L:; , TC 8 i r u :T n ;',T„L' s ';L 3i 

o.w 0-0 0 « 0 0.0 " - ' ^ h‘ - 


0 - 0 

*This entry is not verified.
Mycoplasma hyorhinis

Yor 


V . u 


!d c ‘ v * 5 p\ O c . 0 1 j" t >1 a;- i' p ru O 1 i ~ i . . 4 , 


K 

EMBO J. (1991) 10:4069-4079

V 1 , , ' 

b-nii of 

■' *3f i 3t i • -■ , - . \ *: r 

' j 3 *. i 1 '0 *j ^ , O ■“' L; y. . ; 1 -j . 

^ i 30:;- i- 


>■ Wish 


: *. ** ; f • , . /' , t _. r I . 


t r j ^ f ^ i* 

V A . _ 

* t :i l on 


0 (0 j i 

T r j 


o ^0:111^4 K- 3 ; ,3 0 ;5 j 

3 i 3 3 -7 9 


' ! C> f r- - r r 
o -p i o: 1 c: % 





COMMENT 

Initial Score
Residue Identity
Gaps


Retrieved


by alexk on


Thu


5 Optimized Score
100% Matches

0 Conservativ


*- D f J- ; 4 -■ ’ 0 I - ■ P SI u s 1 n q F a s f, n p 
5 Significance =


Substitutions 

A X 
QAEVS 


Mismatches


■ 09 
0 
0 


VDKLIQINHNNGDLV'HLQyEINLEGQQVDIyDQEHLQAEVSLEOPVVLfcQQGVNKQKLLLNLSNQDHKKQLL 

L J ,J X 4 0 5 0 .£) Q 0 

NL 9 N8DHKK a i.LNL a N«DHKK 9 l.LNL8N8DHKKaLLNl.aN«DHKK«UNL8NaDHKK'LLKKKL9TKNPI 


1 10 


l 2 0 


.13 0 


1 40 


ID 

AC 

DT 

DE 

KW 

KW 

OS 

OS 

FH 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT- 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

PM 

PD 

F'F 

PR 

PR 

PR 

PA 

PI 

DR 

DR 


CELSA-4 (1-5) 

R20793 CDR-grafted/humanised heavy chain gH1.

R20793 standard; Protein; 146 AA.
R20793;

19-MAY-1992 (first entry)

CDR-grafted, humanised heavy chain gH1.

nunrie monoclonal ant i bedu ;• mah- ^ r - 7 .' ■ , 

coi-ip l ement ar i ty ' hU "‘*" 1 •"“•”>«*> CEA ; 

Homo sapiens. 

Mus musculus.

Key Location/Qualifiers

Peptide 1..19

/label= signal
Protein 20..146

/label= VH

/note= "human LAY framework with A5B7 CDRs"

Region 45..54 

/label= CDR1 

/note= "murine residues"

Region 69..87

/label= CDR2

/note= "murine residues"

Region 120..129

/label= CDR3

/note= "murine residues"

Misc_difference 20
/note= "murine residue"

Misc_difference 67..68
/note= "murine residues"

Misc_difference 94..95
/note= "murine residues"

Misc_difference 98
/note= "murine residue"

Misc_difference 119
/note= "murine residue"

WO9201059-A.

23-JAN-1992.

0 5 - J U L - .1 9 91 : G 0 i 1 0 g, 

05 — J U L - 1 9 9 0 ; f, R - n j 4 Q 7 - j 
21-DEC-1990? W0--G0 20 1 7 [ 

Oo-JUL-1991; W0-G0i108. 

(0ELL- > CELLTECH LTD. 

Adair JR, Bodmer MW- Mountain n^., :C r , { 

P I ; 3 £ - o 5 6 £ ‘7 4/07, . 1 " 

N -P30PrQP0937. 

: ■- J.’h : ~p aP v j i -• i . 

-- th/r-Di, -r:: ? . /, V *' 1 - " : ; 




cc 

cc 

cc 

cc 

SQ 

SG 

SQ 

CC 


t i_j v y u m & i. r i y g Li 6 1 n t.. ^ r* r« q n f~. ~j q ^ r- u-w * ^ r% . 

d?f?ir?" (s-r 9 IS»pr?;r‘A-? 9 -*" a **. “ hiCh h^'^rand ASBy 5a q u*r,c es 

S.qu.nc. I 46 AA 7 y Chalr ' codin 9 

3 M 14 L; 6 K- 3 rn 7 fI 3 p! f 4 ^ 13 V 2 J* ‘® G! 1 Hf 

Retrieved by on Thy B5 Feb 93 10 1 42 , oUst J „ s i ’ 9 ' F V ’ 


Initial Sc
Residue Identity =
Gaps =


5 Optimized Score = 5 Significance

100% Matches

Mismatches =

0 Conservative Substitutions


.09 

0 

0 


LFFLSVTTeVHSEVeLLESGCCLV^PGGSLFLSCATSGFTFTDVVHNWVRaAPGKCLEWLGFIGNKANCVTT 

Ju 4u 50 60 70 


X X 
QAEVS 


EYSASVKGRFTISRDKSKSTLYLQMNGLGAEVSAIVYf 

30 ™ i 00 . no X 


trdrglrfyfdywggg 

120 130 


TLVTVSSASTKGP 
1 40 


16. CELSA-4 (1-5)

S06727 *Hypothetical protein 1 (mini-circle) - Streptomyce


entry 

title 

DATE 

PLACEMENT 

COMMENT 

SOURCE 

REFERENCE 

#Authors

#Journal
#Title 


S06727 #Type Protein

*Hypothetical protein 1 (mini-circle) - Streptomyces
coelicolor

lS-Jun-1792 #Beqyence 13-Jun.-I992 #Tevt 1S-Jun-1992 

0.0 0.0 0.0 0.0 0.0
*This entry is not verified.


Streptomyces coelicolor


Henderson D.J., Lydiate D.J., Hopwood D.A.
Mol. Microbiol. (1989) 3:1307-1318
Structural and functional analysis of the
mini-circle, a transposable element of
Streptomyces coelicolor A3(2).

#Reference-number S06727

#Accession S06727

#Cross-reference EMBL:X15942

SEQUENCE #Molecular-weight 13374 #Length 122

Retrieved by alexk on Thu 25 Feb 9 


COMMENT 


ftChcc: k =U.m 92 4? 


Initial Sc o 


I 0 : 4 d ■ 0 1 - P S 7 u s i n 


:■ r y 


C* • t _ - wf-' u i ri i v„ 

Residue Identity = 100% Matches


Optimized Score 


Gaps


0 Conservative Substitution


5 Significance =
5 Mismatches =


g FastDB

2.09
0 
0 


QAEVS 


M ILLtjLVQHMAEVERNWFQRvFAGQM l,c,c,, l ' , P / **c'^:-i'■--*-r-- ■ - i i ! ! ! 

10 S ' U “-^■’‘'uFrthsuF.CLDEAVAANaAEVSRGREl 1 ADAGt DP 

=« > 60 70 


uL H L.S fc.a EA G H VGD G GVSLRW I MVHli I EE YARHNGHA™ ' i nr t T i - . 

30 ?o 100 “no"*'"""‘“20 


17. CELSA-4 (1-5)

YH ' 2 -Sift n "'mini-circle Hvpi/n 







DT 

DT 

DE 

OS 

OG¬ 

RN 

RP 

RC 

RM 

RA 

RL 

DR 

DR 

KW 

SQ 

r.r. 


. •* ' i't 7 L-Kh A ! bD ) 

01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)

01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)

S^^i2^. HVPOrHETIC '' L »-3 *» PR0)EIN? Art ' 

STREPTOMYCES COELICOLOR.

PROKARYOTA; FIRMICUTES; ACTINOMYCETALES; STREPTOMYCETACEAE.

SEQUENCE FROM N.A.

STRAIN=A3(2);

90136057

HENDERSON D.J., LYDIATE D.J., HOPWOOD D.A.;

MOL. MICROBIOL. 3:1307-1318(1989).

EMBL; X15945; SCMINCIR.

PIR; S06727; S06727.

1 FiANSFOSABLE i— ! F M I- M T j . UVCihti ir-r r -’-i a • —___ _. 

SEQUENCE 1 


Initial Score =
Residue Identity = 
Gaps = 


AA ? 

13374 MW; 65 

by ale 

xk on Thu 25 

5 

Optimised Sc 

100%

Matches

0

Conservative


5 Significance =
5 Mismatches =


.09 

0 

0 


X X 
QAEVS 

HTLLGLV.HMAEVERNWFSRVFAGaNVPPVFCESWHDGFALKSGRGLDEAVAAW.Ui^RGRELiADAGLDD 

" 40 50 X 60 70 

SGHLSEGEIAG'H VGDQGVSL.RW I MVHMI EE YARH-NGHADL I RFQ ■ DRTTQA 
80 100 110 .t20 





IntelliGenetics

> G < 


Best Available Copy 


FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-14-align.res made by alexk on Thu 25 Feb 93 11:15:30-PST


Query sequence being compared:
Number of sequences searched:
Number of scores above cutoff:


CELSA-14 (1-5)
50
30


Results of the initial comparison of CELSA-14 (1-5) with:
File : celsa-14.pep


[Histogram: NUMBER OF SEQUENCES plotted against SCORE and STDEV; plot not recoverable]

PARAMETERS 


Similarity matrix

Unitary

K-tuple

Mismatch penalty

er 

U 

Joining

Gap penalty

1.00

Window

Gap size penalty

0.26


Cutoff score

0


Randomization group

0 



Scores


SEARCH STATIST
Mean Median


Times


0 0 


C F ! u 

o; o 


.> 


Number
Number
Number


of residue


< .* i -• r q - i . j ■ 1 *. 
Of Si" or-:. 


I o.f 


43939


Standard Deviation
0.00

Total Elapsed

00 ?00i0! .00 


o r*j 



-•> u l s_ u i. 


''Best Avaifa^ble'C8f» ! y 1 n 1 *- 3 a 4 5 c o r e . 

50 100% similar sequences to the query sequence were found:


Sequence Name 

1. S13268

2. ACVS_NOCLA
3. S12332
4. UBR1_YEAST
5. RRWGNV

6. VOR1_NMV

7. WMWGPV 

8. S14005
9. VOR1_PVX

10. VOR1_PVXCP

11. VOR1_PVXX3

12. VGIHJ2

13. VGL2_CVM4 

14. VGIH59 

15. VGL2_CVMA5

16. VGIHMJ

17. VGL2_CVMJH
18. A40986

19. SAHU4F 

20. A 3 9 9 S 4 

1. CELSA-14 <1-5) 

ir , 


Description


N 


*alpha-Aminoadipyl-L-cy

L-(ALPHA-AMINOADIPYL)-L
Ubiquitin--protein ligase
N-END-RECOGNIZING PROTEI
RNA-directed RNA polyme
186 KD PROTEIN (ORF 1).
RNA-directed RNA polymer

#Hypothetical protein,
165 KD PROTEIN (ORF 1).
165 KD PROTEIN (ORF 1).
165 KD PROTEIN (ORF 1)

E d a 1 U r n r -1 r-i c 

E 2 
E 2 
E 2 
E 2 
E 2 

*M-cadherin - Mouse (fr
Cell surface antigen 4F

*Cell surface antigen 4F


glycoprotein precursor

GLYCOPROTEIN PRECURSOR

glycoprotein precursor
GLYCOPROTEIN PRECURSOR
glycoprotein precursor
GLYCOPROTEIN PRECURSOR


S13268
ENTRY 

title ■ 

DATE 

PLACEMENT 
COMMENT 
SOURCE 
REFERENCE 
#Authors
#Citation



L ength 

Score 

- — -- 

- ■* 

— 

•* - i ^ y 

364 9 

O 

:yste 

3649 

•-J 

■ - Y 

1 950 

5 

! (UG 

19 50 

0 

t se ~ 

1 6 4 3 

5 


164 3 

ET 

S tz* — 

14 56 

ter 

6 K * - 

1 4 56 

cr 

.J 


1456 

5 


1 4 56 

~i 


1 456 

tr 

- M 

t 3 7 6 

5 

(5 P 

1 3 7 6 

rj 

-■ $4 

1 364 

tr 

(BP 

1364 

‘ j 

* M 

1635 

5 

(BP 

1 2 3 5 

\3 

0 bM- t- 

730 

5 

v 

569 

5 

he a 

569 

b 

1 1 ~D~ 

v a 1 i. n e 

s y n t h e 

iy l-D 

" v s 1 1 r i e 

synthn 


I n i t, Opt,. 

Score 


o 

5 


o 1 8 2 to 8 $ 1 y P ; e P r o t e i n 

*alpha-Aminoadipyl-L-cystei
Streptomyces lactamdurans

°""o?r l,9 o.r s,qu o?r os ;T' I ’- 8 -* T * xt 

*This entry is not verified.
Streptomyces lactamdurans


0,0 


•- 1 1 y ® P r a» 


0.00 
0 . 00 
0.00 
0 . 00 
0.00 
0.00 
0.00 
0.00 
0 . 00 
0.00 
0.00 
0 . 00 
0,00 
0 . 00 
0 , 00 
0,00 
0.00 
0.00 
0.00 
0,00 


0 

0 

0 

0 

0 

0 

0 

u 

0 

0 

w 

0 

o 

0 

0 

0 

0 

0 

0 

0 


Martin J.F.

submitted to the EMBL Data Library, January 1991

#Reference-number S13268

Recession SI 8268 
#Cross-reference EMBL:X57310
SUMMARY # M o 1 e c uI ar-u* i q ht 4040*, . 

SEQUENCE d K ' 0 fe 4 * Lvt , gth j* 4 g #ch .. k „ :J£2?& 

COMMENT s et ,. 1BVO , , 

" - BV 'd on Thu 25 Feb 93 10^01-PST using FeetDE 

= 5 Optimized Scor

= 100% Matches

= 0 Conservativ


Initial Score
Residue Identity

Gaps


5 Significance
= 5 Mismatches

Substitutions

A L d to rt D R A Y V T V T S G T T G V P h G U p K y h y 3 y 1 / k,'c r t :v "rr- ., c , - _ . .. _ _ 

400 4 i 0 4.-1 ,0 ' 11 ASYVPEPHLRQTLt AL iM 

4 50 4 6 o 


0.00 

0 

0 


YLNAT

EQTL VJ VPDDVRLDP r U. FPE .* I ERHRVTV;'. MA‘! GSw, ..OiPn: • r- • . . ., .. 

4- u -IRQ 4*/{; : -- : n v "''‘'I .•''••vv;. .7..,, 


[V * ‘.J 







CELSA-14 (1-5)

ACVS_NOCLA L-(ALPHA-AMINOADIPYL)-L


ID 

AC 

DT 

DT 

DT 

DE 

DE 

GN 

OS 

OC 

RN 

RP 

RC 

RM 

RA 

RL 

CC 

CC 

CC 

CC 

CC 

CC 

CC 

DR 

DR 

KW 

SQ

CC 


ACVS_NOCLA 
P27743; 

01-AUG-1992 (REL. 21

01-AUG-1992 (REL. 23, LAST SEQUENCE UPDATE)


STANDARD; PRT;

CREATED)


STEINYL-D-VALINE SYNTHE
3649 AA.


01 -AIT- 1 OQo <vc, r "- ! UPDATE 

01-AUG-1992 (REL. 23, LAST ANNOTATION UPDATE)

L-fALPHA-AM'fwnAntDu. > , --- - J " WLA,t -' 


L-(ALPHA-AMINOADIPYL)-L-CYSTEINYL-D-VALINE SYNTHETASE (EC 6.-.-.-)

(ACV SYNTHETASE) (ACVS).

PCBAB.

NOCARDIA LACTAMDURANS.

PROKARYOTA; FIRMICUTES; ACTINOMYCETALES; NOCARDIOFORM.

SEQUENCE FROM N.A.

STRAIN=VAR LC 411;

92065308

COQUE J.J.R., MARTIN J.F., CALZADA J.G., LIRAS P.;

MOL. MICROBIOL. 5:1125-1133(1991).

-!- FUNCTION: EACH OF THE CONSTITUENT AMINO ACIDS OF ACV ARE ACTIVATED
AS AMINOACYL-ADENYLATES WITH PEPTIDE BONDS FORMED THROUGH THE
PARTICIPATION OF AMINO ACID THIOLESTER INTERMEDIATES.
-!- PATHWAY: FIRST STEP IN THE BIOSYNTHESIS OF PENICILLIN AND
CEPHALOSPORIN.
-!- SIMILARITY: TO OTHER ENZYMES WHICH ACT VIA AN ATP-DEPENDENT
COVALENT BINDING OF AMP TO THEIR SUBSTRATE.

EMBL; X57310; NLPCBABC.

PIR; SI 8268; Si,3263. 

"^n^L ANTIBI0TIC 1 SSYNTHESIS; DUPLir A TiP N 
SEQUENCE 3649 AA; 404079 MW; 2.01961E+07 CN;

•- Retrieved by alexk on Thu 25 r* b 07 " • , ’ C , T 

’ ,w A ^ Ut.-Pb r using FastD B 


Initial Score
Residue Identity =
Gaps =


w 

100%

0 


Optimized Score =

Matches Significance = 0.00

Mismatches = 0

Conservative Substitutions =


^r RERAV ^lr GTTGVP ^---%-----r---TE R VMP AS VVEEPHLP a TL IA L IN 

440 *50 460 


X 


x 


YLNAT 


470 flTLVIVP °80 RLDPI!LF 4?o I E RHGVT ioo^X SSVL0 2 RELRRCAS L -? LLLVGE ELrA35LR8l.REKFAG 

*" -* 1 uf J. w —, ',-J f) *rr ~7 

- J .J 3 U T /t n 


R V V* N E Y A F 7 E. A A F V T A U w F p p * p r* *; t- c n d n r-. T - r - _. 

5S o T ' 36 5'• rtavT “S55" B,u ' <PL ;i!Ty vLsa ‘:!:? aLpic -'‘«Lyi 


40 


530 


590 


600 


3. CELSA-14 (1-5)


S12332

ENTRY

TITLE

DATE

PLACEMENT
SOURCE
ACCESSION
REFERENCE
#Authors




Ubiquitin--protein ligas


Yeast


*. S 6C c!'* r*o pm jr i-. =, 


o e c / n , i; • o n u c e s 


S12332 #Type Protein

Ubiquitin--protein ligase

, f ‘ ^ >-ev ^ - *■ ** •' "SC -number- 6.3.2. ! o 

I 6 “■ B e c» - l v 9 ~ j J; ^ * i - ^ - j / _ w " 

n Z : ^ 16-ac-p-1992 ;:r..v{ :o • 

\j o 0 {) n f .. * "N a 

Sacchar-oi’iLiC^. ^ 

s i B 3 3 ?: 


*-*• a r t e ! 
EiNTO ■: 


. { ; 91/ ,9 , 9 - B 1 - 7 ■/ - *< i f 




- - --• ‘ - ^ p U t J t~\ 

#Residues 1-1950

C (■' O 3 5 *~ f-" 0 T 0 ['■• & j-- ( £ & p fvj jC;t^ »■ )(*5 7 4 

GENETIC " ‘ 

'H Map-positi o n 15 

#Name UBR1

SUPERFAMILY #Name ubiguit in-nrrt-^ , • 
s“ S P-ot eir , d4'ad a t.on ! 

SEQUENCE krt ‘>«“>»'-<«i9ht 28483* #L.r, 8 th 1950 »Ch« ksu „ 76 *» 

COMMENT Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB


0.00 
0 
0 


Initial Score = 5 Optimized Score =

Residue Identity = 100% Matches Significance

Gaps = 0 Mismatches

0 Conservative Substitutions

330 ” ' L ' VFDIJ 340 DUL ' SRI ^^ brtAr!LK CSQDLSSV L GGF F AVa T NGLSA TLT SWSEYLHOETCKYIIl 

J6U 3/0 3S0 390 

X X 
YLNAT 

■ ^i HcLNip ^fr TrFRNfi ^s TLc3E i4y EcRDMTp Y---^— R v IDLS[L . ADGM0 ip L 

X 4 ^° 440 450 ■ 460 

47 UHHKIL(.E88TH8L8«.INBVET8T8RTYSNTm.*HILVEDNRyHKItl.W0I8 ( IVnm» 

j0 ° 510 530 530 


4. CELSA-14 (1-5) 

UBR1_YEAST N-END-RECOGNIZING PROTEIN (UBIQUITIN-PROTEIN LIGAS

UBR1_YEAST STANDARD;

P19812;

01-FEB-1991 (REL.


ID 
AC 
DT 
DT 
DT 
DE 
' DE 
GN 
OS 
OC 
RN 
RP 
RC 
RM 
RA 
RL 
CC 

cc 

CC 

• cc 

cc 

DR 
DR 
K W 
SO 

c c 


PRT; 1950 AA.


(REL. 17, CREATED)

(REL. 17, LAST SEQUENCE

(REL. 19, LAST ANNOTATI

IZING PROTEIN (UBIQUITI


N-END-RECOGNIZING PROTEIN (UBIQUITIN-PROTEIN LIGASE E3 COMPONENT) (N-
RECOGNIN).
UBR1.

SACCHAROMYCES CEREVISIAE (BAKER'S YEAST).

EUKARYOTA; FUNGI; ASCOMYCOTINA; HEMIASCOMYCETES.

SEQUENCE FROM N.A.

STRAIN=GRF38 (S288C);

91006011

^. e ;; 3 ^S<^ 0 r sHAVSK¥ 

PATHWAY.'” UBR^ 1 BI^e H Tn K p9rTp!> T c 0 - F ':f , !? P0WENT 0F T4E N-EMD RULE 

THAT are destabilizing 4«'N0-ter«inal. residues 

■ BIND TO OTHERWISE I DENT I c Atj L pp . pi — E „' BUT DBES WT 
TERMINAL RESIDUES ‘ o ! Ai:' 1 1_ 12 1NG AMINO- 

EMBL; X53747; SCUBR1.

PIR; S12332; S12332.

UBIQUITIN CONJUGATION,

1 1 1 - .. . ... 


SEQUENCE 1950 AA» 2S4336 MW? 1 , 3334 ,-, 3 * 


• ft e* t f~* i 0 v 0 d b Lj 

I n i t. zap [j q- jp j-- 0 
du* 

a *3 p S 


5Cv;l/ 


T hi 


CN : 


< e s 1 ci u a j d r f *t i t u 


5 Optimized Score
100% Matches


! ra-:,DH 

Significance = 0.00


l i i.,o 


•** 1 1 A L A 0 C V r 0 i-.f 0 H : * : > 
.T.'-.O 74*: 






YLNAT

;r CLN,P 1?r TTFftNM I^ TLC8E i^H«‘>«T ( » V v E KyPSNK F n I! N DP v R y II , t s i 

440 450 460 

GHHKILPESSTHSLSPLINDVETPTQrtv^mtr. fli.-, 

470 430 4^0 YWKrtLRKDIGNVI IPTLA 


j 1 0 




20 


530 


5. CELSA-14 (1-5) 

RRWGNV RNA-directed RNA polymerase


hi 


ENTRY 

TITLE 


arcissus mosaic vir 


RRWGNV #Type Protein 

RNA-directed RNA polymerase - Narcissus mosaic virus

Jtrew.ATt-NA^ r n :^~' s /- 7 - 7 -« ” 

3 §6??r l,, §.; B,,u :?r 30 r?s p: ‘”fo* T '* 1 3 °- sep - 199 

narcissus mosaic virus
JT0470 


DATE ' 
PLACEMENT 
SOURCE
ACCESSION
REFERENCE
#Authors


#Journal
#Title


Zuidema D., Linthorst H.J.M., Huisman M.J., Asjes

C.J., Bol J.F.


0 . Gen. Virol. < 1 99 i 7 0 - in o-.. 

Nucleotide sequence of narcissus mosaic virus RNA.

#Reference-number JT0470

#Accession JT0470

#Molecule-type genomic RNA

#Residues 1-1643 <ZUI>

#Cross-reference GB:D00405
COMMENT Thi - vi 

SUPERFfiMILV sr- 

keywords r ,, r!( ,: t ; , .■ _ 

SUMMARY -»• m ~ --'tidyltra n s F a r a 5 e 

•>i M l* 1 0 c u l j*j r ■■ iji p i n h f 1 0 s"? r» hi 

SEQUENCE ~ 3ht 1 w63 " , ° -^Length 1643 «,, m ° 0 ' 

COMMENT Retried b-i T . - 

- J dlq? ' ;k ,jr * fnu L5 Feb 93 10 , * 9 : no_p„ T 


Initial Score =

Optimized Score

Residue Identity = 100% Matches

Gaps = 0 Mismatches

0 Conservative Substitutions


5 Significance


VTr rVLLRNDWQTKLPI L PADVFK TFFKSV I Q P r*,o T , . r .. 

900 910 9 pn L Y T * L l " FHa ' L I«VVMHHQNVVF 

/£ - U 730 940 950 

XX 

YLNAT 

SVYHLTMPEAY I AALPEAVF I F ^PYf -r v^j ■! h . 

’70 , SB - 9 ; 0 ■ - ri'o q oo'' K '^ UL f q, ; : ,V 3 VVEEr 1 ';^ KV;jr « HH ^ 

L 1 J i 0 d 0 \ Q 3: 


PSTMKRNAMFDMGHWFMT YA 1_ r.yi tat-~ 

1040 ,osn —0«uT f AP nlulLLl)N HTQFCSERVLVl 


1 050 


l 060 


1 0 7 0 


i HA 


u 6 o 


ILERAVDRIHFfH 


CELSA-14 (1-5)

VOR1_NMV 186 KD PROTEIN (ORF 1).

ID VOR1_NMV STANDARD; PRT

Ac r-ioOPS; ■'' 

DT 01-AFR-1970 <* RFi . a rrc-.T.— 

^ -- - h ? ' I f" i J ) 

1 7 L. . t V ■ i AS • r ; t.-t, r ‘ - r- . . r .. 

*: i " ■ - A* m‘, 1 ‘:j ( j ‘ ' 4 ’ ! ' c l ' L 


I ■*>. i. 


‘ J 


"i* V ; - 


LADGNQIPL 


fig FastDB. 

0.00
0 
0 

LTGBNRQ 

0 


SRIPMLV 





RN [1]
RP SEQUENCE FROM N.A.
RM 89279206
RA ZUIDEMA D., LINTHORST
RL J. GEN. VIROL. 70:267
CC -!- FUNCTION: RNA REP
CC POSSIBLY FUNCTION
DR EMBL; D00405; NMV.
DR PIR; JT0470; JT0470.
KW NUCLEOTIDE-BINDING; H
SQ SEQUENCE 1643 AA
CC -!- Retrieved by alexk

Initial Sco




1 

5 ( 


VIRUSES; POTEXVIRIDAE.


Residue Identity 

Gaps


100% Matches =

0 Conservative Substitut


Significance
Mismatches


V T PTVLL RN DWGTKLPILPADVrKTFEKSV I8 PCNP I LVFDDyTKLPP £ UE SWM HH 8N VVPULT G DN I v a 

v - j0 940 950 960 

A X 
YLNAT 

SVVHETNPEAyiA.LPEAVEIFSPVCE.^tUjHRfWK^^Kt^VSEREGKtKVN^SHHLK.SR.^.V 

970 A 10u0 1010 1020 1030 
PSTMKRNAMFDMGHHSMTYAGCQGL T APK I 0 I! i ,, 

1040 1050 i.060 Vo-il a* ^ ' 1U SRAVDRIHFIN 

* ~ * kj j 0 3 ' w < | o f ? Q 


7. CELSA-14 (1-5)

WMWGPV RNA-directed RNA polymerase - Potato virus X (str

TITLE HMUGPV #Type Protein 

RNA-directed RNA polymerase - Potato virus X (strain
X3) #EC-number 2.7.7.48

ALTERNATE-NAME RNA replicase

DATE “7 n I 

placement ! SfqUm r 30-Jun-1990 »T„*l 30-8*p-l», 

SOURCE p ot'io viru-i X, PVX ‘ "° 

ACCESSION JA0I02*"" t * Bac ' ,B cv ‘ Sansun #Connon-na*» tobacco 


DATE 

PLACEMENT 

SOURCE 

HOST 

ACCESSION 
REFERENCE 
#Authors


Huisman M.J., Linthorst H.J.M., Bol J.F.,

Cornelissen B.J.C.

J. Gen. Virol. (1988) 69:1789-1798

Complete nucleotide sequence of potato virus X

and its homologies at the amino acid level with

various plus-stranded RNA viruses.

#Reference-number JA0102

#Accession JA0102

#Molecule-type mRNA
#Residues 1-1456 <HUI>

*Cr c 5 s-ref erenc * GB : M3154J 
COMMENT Th i - • -1 1-1 • r- 

SUPER’F am 11 Y Im-!. y : ’/t 3 of 1 F'o ue ■: vr., ; 

Kcvurov' polymerase 

.nucleotidyltransferase 

0 1 " Hi ' ' " no i. ec1 ar-we i nht jV-'w- t,| . . , 

SEQUENCE J ~ L “ V - W ■•'-vr-gcr, 14Co *c ner ••• 

COMMENT Ret.ri ».,*,« v„. . . 

- ■"-> - I r.u dt- f-'c-P 3 I 


:■ \ J r u 5 X 

7 f < 1 m ; r h 


- M 



VLP 


T N ELRLD« 8 KK W NTi«MHfWl»fi?re e TGS IV.FDDYSKLPPGYIEALVCFYSKTKLIILTGDSRQ 

7,0 780 790 300 310 ssn 


S20 


330 


X X 
YLNAT 
1 ! I I i 

3VYHETAEDAS 1 RHLGPATEYFSK YCR YYLNATHRNKKDLANMLGVYSERTGVTE1SMSAEFi EC I PTL Vp 
'-'AO 850 860. X 370 


900 


0 E KRKLYMGTGRNDTFT Y A G C 0 GLTK PK V GIVL D H N T Qy C S A N V M YT A LSRATDRT H FV N T 
93.0 920 930 940 


950 


960 


8. CELSA-14 (1-5)

S14005 *Hypothetical protein, 166K - Potato virus X


ENTRY 

TITLE 

DATE 

PLACEMENT 
COMMENT 
SOURCE 
REFERENCE 
#Authors


S14005 #Type Protein

*Hypothetical protein, 166K - Potato virus X

16-Apr-1992 #Sequence 16-Apr-1992 #Text 16-Apr-1992
0.0 0.0 0.0 0.0 0.0
*This entry is not verified.

Potato virus X, PVX


#Journal
#Title


Orman B.E., Celnik R.M., Mandel A.M., Torres H.N.,

Mentaberry A.N.

Virus Res. (1990) 16:293-306

Complete cDNA sequence of a South American isolate
of potato virus X.
#Reference-number S14005
#Accession S14005
#Cross-reference EMBL:X55802

SEQUENCE #Molecular-weight 165300 #Length 1456 #Checksum 9959

COMMENT Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB

Initial Score = 5 Optimized Score = 5 Significance = 0.00

Residue Identity = 100% Matches = 5 Mismatches = 0

Gaps = 0 Conservative Substitutions = 0

VLPTNELRLDWSKKVPNTEPYMFKTYEKAL I G'GTGSI V IFDDYSKLPPGYI EALVSFST K I KLIILT G D S R 0 
"° 730 790 SO® 810 820 830 

X X 
YLNAT 

3 V Y H E T S D D AS i RH L GP A I'E VF A R YCR Y Y L NATHRMKKDL A NMLG V YS ER TGTTE I S M3SEFLEG VPt L VP«» 
^ 40 So ' J 860 X S70. 330 390 900 

D E K R R L Y M G T G R N D T F T Y A G C S G L T K P K V Q I V L D H N T Q V C 3 A N V H Y S A L S R A T D RIH F IN T 
9i0 930 930 940 950 960 


9. CELSA-14 (1-5)

VOR1_PVX 165 KD PROTEIN (ORF 1).


ID 

VOR1_PVX

AC

p 0 R 3 v 1 ~j ; 

DT 

01-MAR-1989

DT

01-MAR-1989

DT

01-AUG-1992

DE

165 KD PROTE


STANDARD;


PRT;


1456 AA.


LAST ANNOTATION UPDATE)
F i ) J 



nn 

RA 

RA 

RL 

CC 

CC 

DR 

KW 

FT 

SQ 

CC 


SKRYABIN K.G., KRAEV A.S., MOROZOV S.Y., ROZANOV M.N., CHERNOV

LUKASHEVA L.I., ATABEKOV J.G.;

NUCLEIC ACIDS RES. 16:10929-10930(1988).

-!- FUNCTION: RNA REPLICATION. THE CENTRAL PART OF THIS PROTEIN

POSSIBLY FUNCTIONS AS A NTP-BINDING HELICASE.

EMBL; X05198; P0PVX3.

NUCLEOTIDE-BINDING; HELICASE; RNA REPLICATION.

NP_BIND 735 742 POTENTIAL.

SEQUENCE 1456 AA; 165406 MW; 1.09471E+07 CN;

- !Retrieved by alexk on Thu 25 Feb 93 1 0 : 29 H12-P5T using FsstDB 


Initial Score
Residue Identity
Gaps


5 Optimized Scor
100% Matches


5 Significance = 0.00

= 5 Mismatches =

0 Conservative Substitutions =


VLPTNELRLDWSKKVPNTEPYMFKTYEKALIGG1GSI V IFDDYSKLPPGYIEALICFYSKIKLV ILTGD3RQ 
770 730 790 800 810 320 830 

X X 
YLNAT 

. , . _ INI! 

SV YHt. i ALDAS I RHLGPATEYFSKYCRY YLNATHRNKKDL ANMLGVY8ERTGV TE 1SMSAEFLEGIPTLVFS 

870 


850 860 X 870 880 890 900 


D E n R R L Y M G T G R N D T F T Y A G C Q G L T K P K V 01 V L D H N T Q V C 3 A N V M Y T A L S R A T D R I H F V M T 
910 9 20 930 . 940 950 960 


10. CELSA-14 (1-5)



VOR1_PVXCP

165 KD PROTEIN 

(ORF 1). 


ID   V0R1_PVXCP   STANDARD;   PRT;   1456 AA.
AC   P22591;
DT   01-AUG-1991 (REL. 19, CREATED)
DT   01-AUG-1991 (REL. 19, LAST SEQUENCE UPDATE)
DT   01-AUG-1992 (REL. 23, LAST ANNOTATION UPDATE)


DE   165 KD PROTEIN (ORF 1).
OS   POTATO VIRUS X (STRAIN CP) (PVX).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; POTEXVIRIDAE.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   90364772
RA   ORMAN B.E., CELNIK R.M., MANDEL A.M., TORRES H.N., MENTABERRY A.N.;
RL   VIRUS RES. 16:293-306(1990).
CC   -!- FUNCTION: RNA REPLICATION. THE CENTRAL PART OF THIS PROTEIN
CC   POSSIBLY FUNCTIONS AS A NTP-BINDING HELICASE.
DR   EMBL; X55802; FDPQVX.
DR   PIR; S14005; S14005.
KW   NUCLEOTIDE-BINDING; HELICASE; RNA REPLICATION.
FT   NP_BIND 735 742 POTENTIAL.
SQ   SEQUENCE 1456 AA; 165301 MW; 1.0^4621E+07 CN;
CC   -!- Retrieved by alexk on


r r: u 






I ri i i i a 1 3 c o * e 
Residue I d e le L i L1 1 
Gaps 


5 Optimized Score. 
100% Mai.- hos 

0 Corea? , e ! -j V -- 'C! . ; 


b ’! o ‘J 1 Tic 3K- 
M i s r. a bo >ve 


0,00 


V L P f M E L R L D W R K K v r HIE P Y M F ;■; T Y L K A l i G G ! G •] v 
770 780 790




. o -v o 


o o 





D E K ft ft L V l v { G T G 


CRNBTFT^«e 9 «W|fiR tl)HNT8UW4NV „ v , 


' * ; - r - ■! H P .• t i i 


11. CELSA-14 (1-5)

V0R1_PVXX3 165 KD PROTEIN (ORF 1)


ID   V0R1_PVXX3   STANDARD;   PRT;   1456 AA.
AC   P17779;
DT   01-AUG-1990 (REL. 15, CREATED)
DT   01-AUG-1990 (REL. 15, LAST SEQUENCE UPDATE)
DT   (LAST ANNOTATION UPDATE)
DE   165 KD PROTEIN (ORF 1).
OS   POTATO VIRUS X (STRAIN X3) (PVX).
OC   VIRIDAE; SS-RNA NONENVELOPED VIRUSES; POTEXVIRIDAE.
RP   SEQUENCE FROM N.A.
RM   83099944

RL j:’gen N viR0L. L l9U7S^i?9BMi(- BL J-F " C0RNELIS5EN 8.J.C. 

CC   -!- FUNCTION: RNA REPLICATION. THE CENTRAL PART OF THIS PROTEIN
CC   POSSIBLY FUNCTIONS AS A NTP-BINDING HELICASE.
DR   EMBL; D00344; PVXX3.
DR   PIR; JA0102; WMWGPV.

- ?S ,0E -“r 51 "f— 

B 4 ? E “- " »F. 2 Et o 7rN . 

■ by ,X ** k Thu 25 ?3 lD. 8 »,5*: m using F 


Initial Score 
Residue Identity 
Gaps 


astDi 


5 Optimized Score
100% Matches =

0 Conservative Substitutions


Significance

Mismatches


VLF TNELRuDWSKK VPNTEP YMFK T YEKAL I r orr-r- r •; t c:-- ~ - 

770 7o 0 ; o ); ,J,J ’ ,J ^ lFuu r °*L ? R0YIEAI.VCFYSK1 K!. 

' •• J 800 810 sao 

X X 
ylnat 

SVYHETAEDASIRHLGPATEYFSKYrpyv' 1 ii THQ\i~r~. i-.. , 

840 850 •'-•;-^-^rH h NnKuLANMLGVYSERTGVTEISMSAEFL 

wuU A 8/0 380 S90 

DEKRKLYMGTGRNDTFTYAGCQGLTKPKV Ql V! DHNTO-•'qs V „ M „ TMr 

910 9p 0 • -J Wt.LHi.jTi3,.„bANyMt TALSRATDRIHFVNT 

’ ^ 940 950 960 


0,00 
0 


1ILTGDSRQ 


.EG IPTLVPS 
900 


12. CELSA-14 (1-5)

VGIHJ2

ENTRY 

TITLE 

ALTERNATE-NAME
INCLUDES
DATE

PLACEMENT
SOURCE
ACCESSION
REFERENCE
#Authors
#Journal


E2 glycoprotein precursor

VGIHJ2 #Type Protein

E2 glycoprotein precursor
‘ 5 1- r a i n ui id t y pe M j-,; y - q 
peplomer glycoprotein\ spike glycoprotein
90B glycoprotein\ 90A glycoprotein
J 1 - M a r -1 9 9 1 # S *. q u e n c e 3 i ? 

L.: d 7 : l .0 4 B o t ;-j --J 

lO L-S r - l r': •=- h'r'T- t 1 1,V t , * ■ > - > - . 


i r*i t ~* ft e- f j t i t -j 


i r ’ ! h e p 3 t i t i 





.» ul u t r, 3 x UYl 4 „ „ . . _ 

#Molecule-type genomic RNA
#Residues 1-1376 <PAR>


This virus is a member of the family
#Name coronavirus E2 glycoprotein
glycoprotein\ membrane protein


Coronaviridae


COMMENT
SUPERFAMILY
KEYWORDS
FEATURE
1-14
15-1376
15-769

770-1376
1321-1338

31,60,134,192,357,435,
442,582,677,709,717,
740,789,806,896,945,
1178,1232,1242,1261,
1277,1298,1370 #Binding-site carbohydrate (Asn)
(covalent) (predicted)

SEQUENCE #Molecular-weight 151381 #Length 1376 #Checksum 4481

COMMENT Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB


#Domain signal sequence <SIG>\
#Protein E2 glycoprotein <E2G>\
#Protein 90B glycoprotein <EGB>\
#Protein 90A glycoprotein <EGA>\
#Domain transmembrane <TMN>\


Initial Score    =    5   Optimized Score            = 5   Significance = 0.00
Residue Identity = 100%   Matches                    = 5   Mismatches   =    0
Gaps             =    0   Conservative Substitutions                    =    0


X x 
YLNAT 

MLrVr- ILLLP3CLGYIGDFRCIQTVNYNGNNASAP3ISTEAVDVSKGLGTYYVL nFVYLNATi ! 

10 20 30 40 50 60 X" 


LTGYYPVD 

70 


ob vi V rtNLALTGTMTLSLTWFKPPFL3EFNDGIFAKVQNL KTNTPTGATSYFPT I V 1 GSLF. 7 N J sY T WI EPv 

90 100 1 10 120 130 ‘ . 


140 


NNIIMASVCTYTICQLPY 
150 160 


13. CELSA-14 (1-5)

VGL2_CVM4 E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN) (PE

ID   VGL2_CVM4   STANDARD;   PRT;   1376 AA.
AC   P22432;
DT   01-AUG-1991 (REL. 19, CREATED)
DT   01-AUG-1991 (REL. 19, LAST SEQUENCE UPDATE)
DT   01-MAY-1992 (REL. 22, LAST ANNOTATION UPDATE)
DE   E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN) (PEPLOMER PROTEIN).
GN
OS   MURINE CORONAVIRUS MHV (STRAIN WILD TYPE 4) (MHV-4).
OC   VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; CORONAVIRIDAE.
RN


RP   SEQUENCE FROM N.A.
RM   90085315
RA   PARKER S.E., GALLAGHER T
RL   VIROLOGY 173:664-673(198
CC   -!- FUNCTION: THE PEPLOM
CC   TO THE HOST CELL REC
CC   -!- SIMILARITY: NEARLY I
CC   AND MHV-A59 STRAINS,
CC   -!- SUBCELLULAR LOCATION
DR   EMBL; M32789; MHVE8GLY.
DR   PIR; A3374S; VGIHJ2.

KW 

! > - ■ *0 P K 1 J T E T ,'-j* 

P ; 1 \/ { - ‘ ■ , J - <™ "T* 

P T 

’• 'TN A!. 



S THE BINDING OF VIRIONS

VED IN MEMBRANE FUSION.

GLYCOPROTEINS FROM MHV-JHM


O > 



FT CHAIN 
FT DOMAIN 
FT TRANSMEM 
FT DOMAIN 
FT DOMAIN 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
SQ SEQUENCE
CC -!- Retriev

Initial Score 

Residue Identity

Gaps 


.i. 

7 70 
10 
1321 
1339 
42? 
31 



1320 


DTE IN 


1333 

1376 

599 

31 


60      60      POTENTIAL.
134     134     POTENTIAL.
192     192     POTENTIAL.
357     357     POTENTIAL.
435     435     POTENTIAL.
582     582     POTENTIAL.
677     677     POTENTIAL.
709     709     POTENTIAL.
717     717     POTENTIAL.
740     740     POTENTIAL.
789     789     POTENTIAL.
806     806     POTENTIAL.
945     945     POTENTIAL.
1232    1232    POTENTIAL.
1242    1242    POTENTIAL.
1261    1261    POTENTIAL.
1277    1277    POTENTIAL.
1298    1298    POTENTIAL.
1370    1370    POTENTIAL.

1376 AA; 151382 MW; 9 a 26615

ed by alexk on Thu 25 Feb 93


S1 (90B)

PROTEIN S2 (90A).
EXTRACELLULAR (POTENTIAL).
POTENTIAL.

CYTOPLASMIC (POTENTIAL).

I rlPOR TANT FOR Jh £ jvjpyp q y | pyj_ p»g r p 
POTENTIAL.


CN;

10:29:02-PST


5 Optimized Score
100% Matches

0 Conservative


using FastDB
0


Substitutions


Significance
Mismatches


v/ 

0 

n 


M L r V F I L L L P S C L G Y I G D F R CIST V N Y N G N NASAPSIS T E A V D V S K G 1 . G T V V V» n R * 
10 20 30 40 50


X X 
YLNAT 

mu 


> R v Y L N A. T L L L T G Y Y P V D 

GSNYRNLALTGTNTLSL TiYFKPPFLSEFNDG I FFKYQNLKTNTPTGATSYFPTIV J GSL.FGNTSYTVVLFFY 

110 120 130


140 


NNIIMASVCTYTICQLPY 
150 160 


14. CELSA-14 (1-5) 


VGIH59 

ENTRY 

TITLE 

ALTERNATE-NAME 

DATE 

PLACEMENT 
SOURCE 
ACCESSION
REFERENCE

#Authors


#Journal
#Title


E2 glycoprotein precursor - Murine hepatitis virus
VGIH59 #Type Protein

E2 glycoprotein precursor - Murine hepatitis virus
(strain A59)

peplomer glycoprotein\ spike glycoprotein

r! 1 1 *' 1 7 8 7 w 3 G " q ^ © I"' C © 3 1 - M a r - 1 9 S 9 ft T e- t 1 " - *.... , n „ 

2231.0 4.0 1.0 i.o t o “ "** H 

murine hepatitis virus, MHV

A27402 


Luytjes W., Sturman L.S., Bredenbeek P.J.,
'- J ■ 1 v a 9r* Z e i * ~ 7. r a m u^ 

W, J . M, . ‘ ' l "*-* ! - 


: J i 


* ? o 7 ; 1 O 1 J 4 7 - A y 

1 ‘ v 1 ;1 •' T • ■ C' *1 t _ J "• 7; ry ( ■ 


h o t 1 1- cr 






#Molecule-type genomic RNA


#Name coronavirus
glycoprotein


#Residues 1-1324 <LUY>

COMMENT This virt 

SUPERFAMILY INsrp rr,r 

KEYWORDS qlucopmt 

FEATURE 
1-16
17-1324
17-717
718-1324

31,60,192,247,357,435,

442,530,625,657,665,
688,737,754,844,893,

1126,1180,1190,1209,
1225,1246,1318


is a member of the


E2 glycoprotein


Coronaviridae


#Domain signal sequence <SIG>\

#Protein E2 glycoprotein <E2G>\

#Protein 90B glycoprotein <EGB>\
#Protein 90A glycoprotein <EGA>\


#Binding-site carbohydrate (Asn)

(covalent) (predicted)\

#Domain transmembrane <TMN>

SEQUENCE #Molecular-weight 145963 #Length 1324 #Checksum 7300

COMMENT Retrieved by alexk on Thu 25 Feb 93 10:29:02-PST using FastDB


Initial Score    =    5   Optimized Score            = 5   Significance = 0.00
Residue Identity = 100%   Matches                    = 5   Mismatches   =    0
Gaps             =    0   Conservative Substitutions                    =    0


YLNAT 

MLFVFILFLPSCLGYIGDFRCIQLVNSNGANVSAPSISTETVEVSQGLGTYYVLDRVYLNATLLLTGYYPVD

10 20 30 40 50 60 X 70

GSKFRNLALTGTNSVSLSWFQPPYLNQFNDGIFAKVQNLKTSTPSGATAYFPTIVIGSLFGYTSYTVVIEPY

90 100 110 120 130 140

NGVIMASVCQYTICQLPY
150 160


15. CELSA-14 (1-5)

VGL2_CVMA5 E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN)


VGL2_CVMA5   STANDARD;   PRT;   1324 AA.

P11224;


01-JUL-1989 (REL. 11, CREATED)
(REL. 11, LAST SEQUENCE UPDATE)
01-MAY-1992 (REL. 22, LAST ANNOTATION UPDATE)
E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN) (PEPLOMER PROTEIN).
MURINE CORONAVIRUS MHV (STRAIN A59).
VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; CORONAVIRIDAE.
SEQUENCE FROM N.A.
83072088
LUYTJES W., STURMAN L.S., BREDENBEEK P.J., CHARITE J.,
VAN DER ZEIJST B.A.M., HORZINEK M.C., SPAAN W.J.M.;
VIROLOGY 161:479-487(1987).
-!- FUNCTION: THE PEPLOMER PROTEIN MEDIATES THE BINDING OF VIRIONS TO THE
HOST CELL RECEPTOR AND IS INVOLVED IN MEMBRANE FUSION.
-!- SUBCELLULAR LOCATION: PERIPHERAL MEMBRANE PROTEIN
EMBL; M18379; CORMHVE2.
PIR; A27402; VGIH59.
GLYCOPROTEIN; ENVELOPE PROTEIN; TRANSMEMBRANE.


1 




FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

FT 

SQ 

CC 


i i ri i ! t 

DOMAIN

TRANSMEM

DOMAIN

CARBOHYD

CARBOHYD

CARBOHYD

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

SEQUENCE 

-!- Retrii 


J7 B^pihWCopf^^cPROT 


/ .1 

/ 

1 2 6 £> 
128 7 
31
60
192
357
435
530
625

657
665

688
737
754
893

1180
1190
1209
1225
1246
1318
1324
ved by 


EIN S2 (90A).
LAR (POTENTIAL


1324 
31 
60 
192
357 
435 
530 
625 
657 
665 
688 
737 
754 
893 
1180
1190
1209
1225
1246
1318

AA; 145964
alexk on


CYTOPLASMIC
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.

POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
POTENTIAL.
MW; 9039956
Thu 25 Feb 93


(POTENTIAL).


CN;

10:29:02-


using FastDB 


Initial Score    =    5   Optimized Score            = 5   Significance = 0.00
Residue Identity = 100%   Matches                    = 5   Mismatches   =    0
Gaps             =    0   Conservative Substitutions                    =    0


X X 
YLNAT

MLFVFILFLPSCLGYIGDFRCIQLVNSNGANVSAPSISTETVEVSQGLGTYYVLDRVYLNATLLLTGYYPVD

10 20 30 40 50 60 X 70

GSKFRNLALTGTNSVSLSWFQPPYLNQFNDGIFAKVQNLKTSTPSGATAYFPTIVIGSLFGYTSYTVVIEPY

90 100 110 120 130 140

NGVIMASVCQYTICQLPY
150 160 


16. CELSA-14 (1-5)


VGIHMJ 

ENTRY 

TITLE 

ALTERNATE-NAME 

INCLUDES 

DATE 


E2 glycoprotein precursor - Murine hepatitis virus
VGIHMJ #Type Protein

E2 glycoprotein precursor - Murine hepatitis virus
(strain JHM)

peplomer glycoprotein\ spike glycoprotein


PLACEMENT 

2 2 3 1 . 0 4 

SOURCE 

f: u r i n e h pi a 

ACCESSION 

A 3 3 0 9 o 

REFERENCE 


#Authors

Schmidt I.,

#Journal

J. Gen. Vir

#Title

Nucleoti


project i

#Reference-

A33093

#Accession

A 3 . n 9 d 

■;{; 1 1 1 t ~ ; 



-; i P Atomic 


' : ■. j * l’ . . • 


90B glycoprotein\ 90A glycoprotein

"7 1 _tvt - * n r~, 4 nm 

sequence* 31 99| $Texi 


1 - 0 2 « 0 1 , 0 
virus, MHV


-'1 « v 9 1 d d £• \ 1 

> 3 0 ; 4 7 • - ^ ~ 


30 - Sep- -1 993 




t. -jL'Op--OT * 5 - i ’ i 


U •" U ' :■’! ■. 
: t'9 1 * . . 





FEATURE
1-10
11-1235
11-628
629-1235
1175-1208

31,60,134,192,357,435,
442,536,568,576,599,
648,665,755,804,1037,
1091,1101,1120,1136,
.22 


J J feestAvailabfetfo u py dh '“ P f -otejn 


#Domain signal sequence <SIG>\

#Protein E2 glycoprotein <E2G>\

#Protein 90B glycoprotein <EGB>\
#Protein 90A glycoprotein <EGA>\

#Domain transmembrane <TMN>\


SUMMARY 

SEQUENCE 

COMMENT 


#Binding-site carbohydrate (Asn)
(covalent) (predicted)

#Molecular-weight 136653 #Length 1235 #Checksum 9154


Retrieved by alexk on Thu 25 Feb 93 10:29:03-PST using FastDB

Initial Score    =    5   Optimized Score            = 5   Significance = 0.00
Residue Identity = 100%   Matches                    = 5   Mismatches   =    0
Gaps             =    0   Conservative Substitutions                    =    0


X X
YLNAT

MLFVFILLLPSCLGYIGDFRCIQTVNYNGNNASAPSISTEAVDVSKGRGTYYVLDRVYLNATLLLTGYYPVD

40 50 60 X 70

GSNYRNLALTGTNTLSLTWFKPPFLSEFNDGIFAKVQNLKTNTPTGATSYFPTIVIGSLFGNTSYTVVLEPY

90 100 110 120 130 140

NNIIMASVCTYTICQLPY
150 160 


ID 

AC 

DT 

DT 

DT 

DE 

GN 

OS 

DC 

RN 

RP 

RM 

RA 

RL 

C C 

c c 
c c 

DR 
DR 
KW 
FT 
FT 
FT 
FT 
F T 
FT 

r.: r 

F 7 


17. CELSA-14 (1-5)

VGL2_CVMJH E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN) (PE
VGL2_CVMJH   STANDARD;   PRT;   1235 AA.

P11225;

01-JUL-1989 (REL. 11, CREATED)
01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
(REL. 22, LAST ANNOTATION UPDATE)
E2 GLYCOPROTEIN PRECURSOR (SPIKE GLYCOPROTEIN) (PEPLOMER PROTEIN).
MURINE CORONAVIRUS MHV (STRAIN JHM).
VIRIDAE; SS-RNA ENVELOPED VIRUSES; POSITIVE-STRAND; CORONAVIRIDAE.
[1]
SEQUENCE FROM N.A.
87111467
SCHMIDT I., SKINNER M.A., SIDDELL S.G.;
J. GEN. VIROL. 68:47-56(1987).
-!- FUNCTION: THE PEPLOMER PROTEIN MEDIATES THE BINDING OF VIRIONS
TO THE HOST CELL RECEPTOR

PIR; A33095; VGIHMJ.


SIGNAL 

CHAIN 

CHAIN 

CHAIN 

DOMAIN 

TRANSMEM 

DOMAIN

CARBOHYD


cat i i; 
vs, 

V/ <r 

jN; PERIPHERAL HEM 

L OPE 

f R ti i r. * '■ n » j p 16 H E H 

1 0 

p o Term al . 

i fJ -j d 

SPIKE E 3 GL 

633 

RP 1 KE PPiJTE 

1 P 3 5 

7 >! ' L 1 \ h: PKOTE 

- 1 7 4 

E v TRACE Li. HI. 


t r! „ 


•: pctp, *t ; 






I n i t i 3 
Re-s i du 
(=aps 


CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

CARBOHYD 

SEQUENCE

-!- Retrie

l Score
e Identity



iVcf k PGTENTIAI . 


435     435     POTENTIAL.
536     536     POTENTIAL.
568     568     POTENTIAL.
576     576     POTENTIAL.
599     599     POTENTIAL.
648     648     POTENTIAL.
665     665     POTENTIAL.
804     804     POTENTIAL.
1091    1091    POTENTIAL.
1101    1101    POTENTIAL.
1120    1120    POTENTIAL.
1136    1136    POTENTIAL.
1157    1157    POTENTIAL.
1229    1229    POTENTIAL.

1235 

AA x 

1 3 6 6 5 3 


M W 7 7 539 g, 

ed by 

al 

e>;k on T 

h 

u 25 Feb 93 

= 

t=- 

Dpti ni 

z 

0 d S c 0 r e •“ 

= 1 

0 0 7, 

Mate h e 

5 

— 

“ 

0 

C 0 n ser 

V 

stive S u b s t j 


. 0 : 29 ’03 


5-PST 

Sign! 

rl :i. s fi a 


u s i n g F a & t D 3 


f icanct 

tches 


0.00 


HL.F- Yr- I lLLPSCLGYI GDFRCISTVNYNGNNASAPSI S jE £ 


X X 
YLNAT 


IE A y D V3 KGR GTY Y VL D R VY LNATLlLT G Y Y P V D 
^ 5 0 AO Y -frs 


GSNYRHLALTGTNTLSLTHFKPPFLSEFNDGiFAKV.NLKTNTPTC^TSVFPT, 

“ V u inn , , „ ' •- -■ 


60 X 
L F G M T 


SYTVVLEPY 
140 


NNIIMASVCTYTICQLPY
150 160


18. CELSA-14 (1-5)
A40936


ENTRY 

TITLE 

DATE 

PLACEMENT 

COMMENT 

SOURCE 

REFERENCE

#Authors

#Journal
#Title


*M-cadherin - Mouse (fragment)

?; 0,8 i . » T «P» (fr, qn .„, 

M-cadherin - Mouse (fragment)

16-Apr-1992 #Sequence 16-Apr-199

0.0 0.0 0.0 0.0 0.0

This entry is not verified.

‘ ^ '** lTf ^ :: •- u u ’ -■ *'■ o r*: o • i *- n a Fi !•*> ; c rr, - 


• A d i *• - 1 9 9. 


Donalies M.,


Cramer M., Ringwald M.,


Starzinski-Powitz A.

Proc. Natl. Acad. Sci. U.S.A.
Expression of M-cadherin, a member of t

multigene family, correlates with di
of skeletal muscle cells


#Reference-number A40936
#Accession A40936


* r 0 -■ 5 ~ r- e f e r e n c e G B - r M 7 a n a 1 

SUMMARY .. 

SEQUENCE 

COMMENT p . 

f 1 1- ■ J ■ V Cl V \ ' ! 


K3th 


^4-8083 
cadhe; in 

11 * J c) t i •*; n 


i I* J 3. 1. I a ! '„i r ;■*[ r 

O 5 i 0 U r j ,'j - :■ 





ENHKRLPYPLVQI K8DK«dMfl^^rePRNVF6I OKFIGRylilUltDR: H 1 PRFRLRA 

10 20 30 40 


DU 


60 


FAL DL 
7 0 


y L ’ STLEDPTDLEIV V V D Q N 0 M R P A F L Q D V F R G H IL E G A IP G 7 F V TRAEAT D a D D P F T D N A A L ft f- S 11 F « G =: p 
80 90 100 110


120 


130


140 


EFFSIDEHTG 

150. 


19. CELSA-14 (1-5)
SAHU4F


Cell surface antigen 4F2 heavy chain


Human


ENTRY 

TITLE 

DATE 

PLACEMENT 
SOURCE 
ACCESSION 
REFERENCE 
#Authors
#Journal
#Title


SAHU4F #Type Protein

Cell surface antigen 4F2 heavy chain - Human
30-Jun-1990 #Sequence 30-Jun-1990 #Text 30-Jun-199

962.0 1.0 1.0 1.0 1.0

Homo sapiens #Common-name man
A28455\ A28314\ A30240


Teixeira S., Di Grandi S., Kuehn L.C.

J. Biol. Chem. (1987) 262:9574-9580
Primary structure of the human 4F2 antigen heavy
chain predicts a transmembrane protein with a
cytoplasmic NH2 terminus.

#Reference-number A28455
#Accession A28455

#Molecule-type mRNA
#Residues 1-529 <TEI>

REFERENCE

#Authors Quackenbush E., Clabby M., Gottesdiener K.M.,
Cytochrome c2 precursor - Rho
VERY HYPOTHETICAL 16.1 KD PRO 
^Hypothetical protein 146 (ph 
CYTOCHROME B6-F COMPLEX SUBUN 
The scores below are sorted by initial score.
17. S22345      *Activin receptor - Human         513    6    6   4.69
18. JQ1486      Activin receptor II precursor     513    6    6   4.69
19. A37913      *mik1 protein - Yeast (Schizo     581    6    6   4.69
20. SWI5_YEAST  TRANSCRIPTIONAL FACTOR SWI5,      709    6    6   4.69
21. TWBYS5      Transcription factor SWI5 - Y     709    6    6   4.69
22. S18512      Cell division control protein    1035    6    6   4.69
23. POLG_POL1M  GENOME POLYPROTEIN (COAT PROT    2207    6    6   4.69
24. S09553      Genome polyprotein - Human po    2207    6    6   4.69
25. GNNY1P      Genome polyprotein (version 1    2207    6    6   4.69
26. POLH_POL1M  GENOME POLYPROTEIN (COAT PROT    2209    6    6   4.69
27. POLG_POL1S  GENOME POLYPROTEIN (COAT PROT    2209    6    6   4.69
28. GNNY3P      Genome polyprotein - Human po    2209    6    6   4.69
29. GNNY2P      Genome polyprotein (version 2    2209    6    6   4.69
30. A37113      *Ryanodine receptor, cardiac     4969    6    6   4.69
31. R10834      Ryanodin receptor.               4987    6    6   4.69
32. RYNR_HUMAN  RYANODINE RECEPTOR, SKELETAL     5032    6    6   4.69
33. A35041      *Ryanodine receptor - Human      5032    6    6   4.69
34. B35041      *Ryanodine receptor - Rabbit     5034    6    6   4.69
35. S18135      *Calcium release channel - Pi    5034    6    6   4.69
36. RYNR_RABIT  RYANODINE RECEPTOR, SKELETAL     5037    6    6   4.69
37. S04654      Ryanodine receptor - Rabbit      5037    6    6   4.69
38. R11510      Ryanodine receptor deduced fr    5072    6    6   4.69
                **** 3 standard deviations above mean ****
39. R04972      Papilloma virus type 16 L1 pe      20    5    5   3.75
40. P81545      Human insulin acceptor protei      33    5    5   3.75
41. A36357      *Cytochrome-c oxidase chain I      66    5    5   3.75
42. NXL3_DENPO  LONG NEUROTOXIN 3 (TOXIN VN2)      72    5    5   3.75
43. NXL2_DENPO  LONG NEUROTOXIN 2 (NEUROTOXIN      72    5    5   3.75
44. NXL1_DENPO  LONG NEUROTOXIN 1 (NEUROTOXIN      72    5    5   3.75
45. N2EP2D      Long neurotoxin 2 - Black mam      72    5    5   3.75
46. N2EP1D      Long neurotoxin 1 - Black mam      72    5    5   3.75
47. A35991      *Retinoic acid receptor gamma      74    5    5   3.75
48. THIO_BPT4   THIOREDOXIN.                       87    5    5   3.75
49. G30292      *Gene nrdC protein - Phage T4      87    5    5   3.75
50. TXBPT4      Thioredoxin - Phage T4             87    5    5   3.75


No alignments saved. 
0
0
0

0 

0 
FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-S.res made by alexk on Thu 25 Feb 93 10:24:25-PST.


Query sequence being compared: CELSA-S (1-11)

Number of sequences searched: 94553

Number of scores above cutoff: 3979

Results of the initial comparison of CELSA-S (1-11) with:
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries


[Histogram of number of sequences by score; axes: SCORE 0 to 7, STDEV 0 to 5]


PARAMETERS

Similarity matrix          Unitary     K-tuple               2
Mismatch penalty           5           Joining penalty       20
Gap penalty                1.00        Window size           5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50          Alignments to save    0
Optimized scores to save   0           Display context       0


SEARCH STATISTICS

Scores:     Mean           Median           Standard Deviation
            1              3                1.04

Times:      CPU            Total Elapsed
            00:03:41.11    00:0s:23.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   3979

Cut-off raised to 2. 

Cut-off raised to 3. 

Cut-off raised to 4. 

Cut-off raised to 5. 

The scores below are sorted by initial score. 

Significance is calculated based on initial score. 
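
The Sig. values listed in this report appear to be consistent with a simple z-score of the initial score against the Mean and Standard Deviation printed under SEARCH STATISTICS. The following is a minimal sketch under that assumption; the function name `significance` is hypothetical and is not part of FastDB, and the printed mean is itself rounded, so small differences from the report are possible.

```python
def significance(score: float, mean: float, stdev: float) -> float:
    """Assumed formula: number of standard deviations above the mean."""
    return (score - mean) / stdev


# Mean and Standard Deviation as printed for the CELSA-S search (assumed inputs).
mean, stdev = 1, 1.04

for initial_score in (7, 6, 5):
    # Round to two decimals, matching the precision used in the Sig. column.
    print(initial_score, round(significance(initial_score, mean, stdev), 2))
```

Under this assumption, initial scores of 7, 6 and 5 correspond to the Sig. values 5.77, 4.81 and 3.85 shown in the list of best scores for CELSA-S.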

A 100% identical sequence to the query sequence was not found. 

The list of best scores is:


                                                        Init. Opt.
    Sequence Name  Description                  Length Score Score  Sig. Frame

                **** 5 standard deviations above mean ****
 1. FAS_CHICK   FATTY ACID SYNTHASE (EC 2.3.1    2446    7    7   5.77   0
 2. XYCHFA      Fatty-acid synthase - Chicken    2446    7    7   5.77   0
                **** 4 standard deviations above mean ****
 3. P70030      Secretory signal sequence of       26    6    6   4.81   0
 4. P70033      Secretory signal sequence of       39    6    6   4.81   0
 5. P70029      Secretory signal sequence of       50    6    6   4.81   0
 6. S17024      Hypothetical protein (IFM1 3'     124    6    6   4.81   0
 7. YIFM_YEAST  HYPOTHETICAL PROTEIN IN IFM1      125    6    6   4.81   0
 8. S20173      *Hypothetical protein - Yeast     125    6    6   4.81   0
 9. FMC1_ECOLI  CFA/I FIMBRIAL SUBUNIT B PREC     170    6    6   4.81   0
10. YQECC1      CFA1 Fimbrial protein precurs     170    6    6   4.81   0
11. SSBR_ECOLI  SINGLE-STRAND BINDING PROTEIN     174    6    6   4.81   0
12. SSBP_ECOLI  SINGLE-STRAND BINDING PROTEIN     174    6    6   4.81   0
13. A38487      *Helix-destabilizing protein      174    6    6   4.81   0


1 £1 . PillFr T P Ua 1 i v - r4a c + - (n i 1 i 7 i |.-,n r% 4 .-s ' — 


t -7^ 


O 1 


u 

n 





16. IBP3_HUMAN  INSULIN-LIKE GROWTH FACTOR BI     291    6    6   4.81   0
17. A34651      *Insulin-like growth factor b     291    6    6   4.81   0
18. I0HU3       Insulin-like growth factor bi     291    6    6   4.81   0
19. R05596      Somatomedin carrier protein s     291    6    6   4.81   0
20. P92300      Sequence of human insulin-lik     291    6    6   4.81   0
21. IBP3_RAT    INSULIN-LIKE GROWTH FACTOR BI     292    6    6   4.81   0
22. A36748      *Insulin-like growth factor b     292    6    6   4.81   0
23. S11738      Hemagglutinin precursor - Inf     373    6    7   4.81   0
24. A39314      *Gastricsin precursor - Bullf     334    6    6   4.81   0
25. PEPC_HUMAN  PROGASTRICSIN PRECURSOR (EC 3     383    6    6   4.81   0
26. A29937      *Gastricsin precursor - Human     383    6    6   4.81   0
27. A31811      *Gastricsin precursor - Human     383    6    6   4.81   0
28. PEPC_RAT    GASTRICSIN PRECURSOR (EC 3.4.     392    6    6   4.81   0
29. A33510      *Gastricsin - Rat #EC-number      392    6    6   4.81   0
30. A24608      Pepsin A precursor - Rat #EC-     392    6    6   4.81   0
31. YAJA_ECOLI  HYPOTHETICAL 44.7 KD PROTEIN      400    6    7   4.81   0
32. JS0349      Hypothetical 45K protein (sbc     400    6    7   4.81   0
33. CYAA_TRYEQ  ADENYLATE CYCLASE (EC 4.6.1.1     469    6    6   4.81   0
34. S16359      Adenylate cyclase - Trypanoso     469    6    6   4.81   0
35. OM6E_CHLTR  60 KD OUTER MEMBRANE PROTEIN      547    6    6   4.81   0
36. S13120      *Omp2 protein - Chlamydia tra     547    6    6   4.81   0
37. PC1_MOUSE   PLASMA-CELL MEMBRANE GLYCOPRO     871    6    6   4.81   0
38. A27410      Plasma cell membrane protein      905    6    6   4.81   0
39. CARB_BACSU  CARBAMOYL-PHOSPHATE SYNTHASE     1071    6    6   4.81   0
40. F39845      *Carbamyl phosphate synthase     1071    6    6   4.81   0
                **** 3 standard deviations above mean ****
41. P98464      Sequence of C. trachomatis se      14    5    5   3.85   0
42. P98452      Sequence of C. trachomatis se      14    5    5   3.85   0
43. P98448      Sequence of C. trachomatis se      14    5    5   3.85   0
44. P98444      Sequence of C. trachomatis se      14    5    5   3.85   0
45. P98468      Sequence of C. trachomatis se      14    5    5   3.85   0
46. P98460      Sequence of C. trachomatis se      14    5    5   3.85   0
47. YPS1_PLEBO  HYPOTHETICAL 4.6 KD PROTEIN (      40    5    5   3.85   0
48. PSBJ_CYAPA  PHOTOSYSTEM II REACTION CENTR      40    5    6   3.85   0
49. S08038      Hypothetical protein 1 - Plec      40    5    5   3.85   0
50. PQ0104      Microbial serine proteinase I      41    5    5   3.85   0

alignments saved
FastDB - Fast Pairwise Comparison of Sequences
Release 5.4 

Results file celsa-9.res made by alexk on Thu 25 Feb 93 10:39:58-PST. 


Query sequence being compared: CELSA-9 (1-16) 

Number of sequences searched: 94553 

Number of scores above cutoff: 4422 

Results of the initial comparison of CELSA-9 (1-16) with: 
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries 
[Histogram of number of sequences by score; axes: SCORE, STDEV -1 to 7]




PARAMETERS

Similarity matrix          Unitary     K-tuple               2
Mismatch penalty           5           Joining penalty       20
Gap penalty                1.00        Window size           5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50          Alignments to save    0
Optimized scores to save   0           Display context       0



SEARCH STATISTICS

Scores:     Mean           Median           Standard Deviation
            2              3                1.06

Times:      CPU            Total Elapsed
            00:03:26.02    00:06:54.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   4422



Cut-off raised to 2. 

Cut-off raised to 3. 

Cut-off raised to 4. 

Cut-off raised to 5. 

The scores below are sorted by initial score. 
Significance is calculated based on initial score. 


A 100% identical sequence to the query sequence was not found.


The list of best scores is:


Init. Opt.

Sequence Name Description Length Score Score Sig. Frame 


                **** 7 standard deviations above mean ****
 1. PC1_HUMAN   PLASMA-CELL MEMBRANE GLYCOPRO     873   10   11   7.58   0
 2. S21706      *Pyrophosphatase - Human          725   10   11   7.58   0
 3. A37216      *Plasma cell membrane protein     725   10   11   7.58   0
                **** 4 standard deviations above mean ****
 4. VD04_FOWP1  25.6 KD PROTEIN.                  218    7    8   4.74   0
 5. A35216      *FPD4 protein - Fowlpox virus     218    7    8   4.74   0
 6. R21437      PE-40 protein cont-g. a methio    361    7    7   4.74   0
 7. R21436      PE-40 somatostatin substitute     361    7    7   4.74   0
 8. R21435      PE-40 somatostatin substitute     361    7    7   4.74   0
 9. R20201      TGF-alpha-PE40ab.                 417    7    7   4.74   0
10. R20177      TGF-alpha-PE40aB.                 417    7    7   4.74   0
11. R07054      PE40AB protein comprising a p     417    7    7   4.74   0
12. R20200      TGF-alpha-PE40Ab.                 420    7    7   4.74   0
13. R06450      TGF-alpha-PE40-ab modified ps     420    7    7   4.74   0

1 A 

O n /_ A A o 

t r rr _ ^ u o rr a a ... a -i 4 r , ... ,j ... - 

a on 

*7 

-? 

A *7 A 

r\ 






16. 

R06447 

TGF-57-Pseudomonas exotoxin 4 

420 

7 

7 

4.74 

0 

17. 

R06994 

PE40ab protein comprising a p 

420 

7 

7 

4.74 

0 

18. 

R06992 

PE40aB protein comprising a p 

420 

7 

7 

4.74 

0 

19. 

R06993 

PE40Ab protein comprising a p 

420 

*7 

7 

4.74 

0 

20. 

AMPL_B0VIN 

LEUCINE AMINOPEPTIDASE (EC 3. 

478 

7 

7 

4.74 

0 

21 . 

APBOL 

Cytosol aminop*eptidase - Bovi 

478 

7 

7 

4.74 

0 

22. 

R04934 

Immunotoxin hybrid of human i 

496 

7 

7 

4.74 

0 

23. 

R04920 

Immunoprotein PEX46. 

549 

7 

7 

4.74 

0 

24. 

R04923 

Immunoprotein TANG11. 

557 

7 

7 

4.74 

0 

25. 

R04919 

Immunoprotein PEX45. 

574 

7 

7 

4.74 

0 

26. 

R04924 

Immunoprotein TANG12. 

577 

7 

7 

4.74 

0 

27. 

TOXA_PSEAE 

EXOTOXIN A PRECURSOR (NAD-DEP 

638 

7 

7 

4.74 

0 

28. 

A30347 

*Exotoxin A precursor - Pseud 

638 

7 

7 

4.74 

0 

29. 

FEPA_EC0L1 

FERRIC ENTEROCHELIN RECEPTOR 

745 

7 

7 

4.74 

0 

30. 

QRECFC 

Ferrienterochelin receptor pr 
###■» 3 standard deviations « 

745 

above mean 

7 

7 

4.74 

0 

31 . 

PN008S 

^Matrix protein M3 - Avian in 

68 

6 

7 

3.79 

0 

32. 

PN0085 

♦Matrix protein M3 - Avian in 

68 

6 

7 

3.79 

0 

33. 

ISS_ECOLI

HYPOTHETICAL ISS PROTEIN. 

102 

6 

6 

3.79 

0 

34. 

UE0036 

Hypothetical iss protein - Es 

102 

6 

6 

3.79 

0 

35. 

ALB2_PEA 

ALBUMIN 2 (PA2). 

231 

6 

6 

3.79 

0 

36. 

S06243 

Albumin 2 - Garden pea 

231 

6 

6 

3.79 

0 

37. 

JQ0319 

Hypothetical 27K protein - Xa 

261 

6 

6 

3.79 

0 

38. 

ETB_STAAU 

EXFOLIATIVE TOXIN B PRECURSOR 

277 

6 

7 

3.79 

0 

39. 

PRSAEB 

Epidermolytic toxin B precurs

277 

6

7

3.79 

0 

40. 

MIM1_CHICK

MYELOID PROTEIN-1 PRECURSOR. 

326 

6 

6 

3.79 

0 

41 . 

A33755 

*myb-induced myeloid protein 

326 

6 

6 

3.79 

0 

42. 

BYR1_SCHPO

PROTEIN KINASE BYR1 (EC 2.7.1 

340 

6 

6 

3.79 

0 

43. 

OKBYR1

Protein kinase byr1 - Yeast (

340 

6 

6 

3.79 

0 

44. 

YPIX_CLOPE

HYPOTHETICAL 38.4 KD PROTEIN 

342 

6 

7 

3.79 

0 

45. 

JT0370 

Hypothetical protein glO - Cl 

342 

6 

7 

3.79 

0 

46. 

VMAT_SYNV 

MATRIX PROTEIN (M2 PROTEIN). 

345 

6 

7 

3.79 

0 

47 . 

MFVNSY 

Matrix protein - Sonchus yell 

345 

6 

7 

3.79 

0 

48.

K1CR_MOUSE

KERATIN, TYPE I CYTOSKELETAL

422 

6 

7 

3.79 

0 

49. 

JT0406

Keratin, type I cytoskeletal

423 

6 

7 

3.79 

0 

50. A25621 Endo B cytokeratin - Mouse

423

6

7

3.79

0

No alignments saved.
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FastDB - Fast Pairwise Comparison of Sequences 
Release 5.4 

Results file celasa-10.res made by alexk on Thu 25 Feb 93 10:32:32-PST.


Query sequence being compared: CELASA-10 (1-12) 

Number of sequences searched: 94553 

Number of scores above cutoff: 4430 

Results of the initial comparison of CELASA-10 (1-12) with: 
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries
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7 


PARAMETERS

Similarity matrix          Unitary    K-tuple              2
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0


SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            2                3                1.10

Times:      CPU                               Total Elapsed
            00:03:25.96                       00:07:28.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   4430

Cut-off raised to 2. 

Cut-off raised to 3. 

Cut-off raised to 4. 

Cut-off raised to 5. 


The scores below are sorted by initial score. 
Significance is calculated based on initial score. 
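
The report does not spell out how the Sig. column is derived. The printed values look consistent with a plain z-score of the initial score against the search-wide mean and standard deviation, and the sketch below works under that assumption. The function name `significance` is hypothetical and is not part of FastDB. Because the mean is printed rounded to an integer, the result can differ from the printed Sig. value in the second decimal place.

```python
def significance(init_score: float, mean: float, std_dev: float) -> float:
    """Assumed z-score: distance of a score from the mean, in standard deviations."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (init_score - mean) / std_dev


# Search statistics printed in this report: mean 2 (rounded), standard deviation 1.10.
for score in (7, 6):
    print(score, round(significance(score, 2, 1.10), 2))
```

With the rounded mean this gives about 4.55 and 3.64, against 4.56 and 3.65 in the Sig. column of this report; the step between adjacent scores, 1/1.10 or about 0.91, agrees with the printed values.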


A 100% identical sequence to the query sequence was not found.


The list of best scores is:


Sequence Name 

Description 

Length 

Init.
Score 

Opt. 
Score 

Sig. Frame 

**** 4 standard deviations above mean ****

1.

WMBEA1

Ribonucleoside-diphosphate re

321

7

7

4.56 

0 

2.

A36062

*Catalase - Maize #EC-number

491 

7 

8 

4.56

0 

3. 

S18S19 

*Catalase - Maize #EC-number 

491 

7 

8 

4.56 

0 

4. 

CATA_IPOBA

CATALASE (EC 1.11.1.6). 

492 

7 

8 

4.56

0 

5. 

CAT1_GOSHI

CATALASE ISOZYME A (EC 1.11.1 

492 

7 

8 

4.56 

0 

6 . 

S07124 

*Catalase - Sweet potato #EC-

492 

7 

8 

4.56 

0 

7 . 

S20999 

*Catalase - Soybean #EC-numbe

492

7

8

4.56 

0 

8.

S10770

*Catalase - Upland cotton #EC

492 

7 

8 

4.56 

0 

9. 

S10395 

Catalase chain 1 - Upland cot 

492 

7 

8 

4.56 

0 

10 . 

CATA_PEA

CATALASE (EC 1.11.1.6).

494 

7 

8 

4.56 

0 

11.

S18346

Catalase - Garden pea #EC-num

494 

7 

8 

4.56 

0 

**** 3 standard deviations above mean ****

12.

C20554

Hemocyanin LpIIa - Atlantic h

20

6

6

3.65 

0 

13. 

A20554

Hemocyanin LpI chain - Atlant

24 

6 

6 

3.65 

0 

14. 

F20554

H w (Ti n r i j a t"i i n ! r~, T ! , } — A 1 1 ■*. 4 -5 t- 


i 







1O. DCJ.JOT 

*DNA gyrase chain B - Lyme di

84 

6 

6 

3.65 

0 

17. A2S002 

Apolipoprotein B-48 - Human (

106 

6 

6 

3.65 

0 

18. JN0145 

Hypothetical 13.6K protein (d 

117 

6 

7 

3.65 

0 

19. HEM4_BACSU

PUTATIVE UROPORPHYRINOGEN-III

130 

6 

6 

3.65 

0 

20. D35252 

*Putative hemD protein - Baci

130 

6 

6 

3.65 

0 

21. P82320 

PAP-III isolated from biologi 

145 

6 

6 

3.65 

0 

22. JT0961

Glutathione synthase large ch 

285 

6 

6 

3.65 

0 

23. CENA_MOUSE 

CENTROSOMIN A. 

289 

6 

6 

3.65 

0 

24. S13800 

*Centrosomin A - Mouse

289 

6 

6 

3.65 

0 

25. R22403 

Partial sequence of N-lipocor 

304 

6 

6 

3.65 

0 

26. B41002 

Annexin II (clones E4 and F4)

314 

6 

6 

3.65 

0 

27. R10689 

Cephalosporin antibiotic bios 

319 

6 

6 

3.65 

0 

28. ANX3_HUMAN 

ANNEXIN III (LIPOCORTIN III) 

323 

6 

6 

3.65 

0 

29. LUHU3 

Annexin III - Human

323 

6 

6 

3.65 

0 

30. P91362 

Human lipocortin-III. 

323 

6 

6 

3.65 

0 

31. ANX3_RAT

ANNEXIN III (LIPOCORTIN III) 

324 

6 

6 

3.65 

0 

32. LURT3 

Annexin III - Rat

324 

6 

6 

3.65 

0 

33. VG16_BPPZA 

ENCAPSIDATION PROTEIN (LATE P 

332 

6 

6 

3.65 

0 

34. VG16_BPPH2 

ENCAPSIDATION PROTEIN (LATE P 

332 

6 

6 

3.65 

0 

35. JQ0168 

Gene 16 protein - Phage phi-2 

332 

6 

6 

3.65 

0 

36. WMBP26 

Gene 16 protein - Phage phi-2 

332 

6 

6 

3.65 

0 

37. WMBP16 

Gene 16 protein - Phage PZA 

332 

6 

6 

3.65 

0 

38. RIR2_HSV23 

RIBONUCLEOSIDE-DIPHOSPHATE RE 

337 

6 

6 

3.65 

0 

39. WMBEi2 

Ribonucleoside-diphosphate re

337 

6 

6 

3.65 

0 

40. WMBE32 

Ribonucleoside-diphosphate re

337 

6 

6 

3.65 

0 

41. ANX2_MOUSE

ANNEXIN II (LIPOCORTIN II) (C

338 

6 

6 

3.65 

0 

42. ANX2_HUMAN 

ANNEXIN II (LIPOCORTIN II) (C 

338 

6 

6 

3.65 

0 

43. ANX2_CHICK

ANNEXIN II (LIPOCORTIN II) (C 

338 

6 

6 

3.65 

0 

44. ANX2_BOVIN

ANNEXIN II (LIPOCORTIN II) (C 

338 

6 

6 

3.65 

0 

45. RIR2_HSV1K 

RIBONUCLEOSIDE-DIPHOSPHATE RE 

339 

6 

6 

3.65 

0 

46. ANXBlxENLA 

ANNEXIN II TYPE I (LIPOCORTIN 

339 

6 

6 

3.65 

0 

47. ANX2_XENLA 

ANNEXIN II TYPE II (LIPOCORTI 

339 

6 

6 

3.65 

0 

48. LUCH2 

Annexin II - Chicken

339 

6 

6 

3.65 

0 

49. LUMS36 

Annexin II - Mouse

339 

6 

6 

3.65 

0 

50. LUBO36

Annexin II - Bovine

339

6

6

3.65

0

No alignments saved.
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F a s t D B Fast Pair-wise Comparison 

of Sequences
Release 5.4

Results file celsa-11.res made by alexk on Thu 25 Feb 93 10:24:30-PST.


Query sequence being compared:   CELSA-11 (1-23)
Number of sequences searched:    94553
Number of scores above cutoff:   3948


Results of the initial comparison of CELSA-11 (1-23) with:
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries


[Histogram: NUMBER OF SEQUENCES (log scale, 0 to 100000) versus SCORE and STDEV]

PARAMETERS

Similarity matrix          Unitary    K-tuple              2
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0


SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            2                4                1.13

Times:      CPU                               Total Elapsed
            00:03:59.10                       00:08:28.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   3948

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.
Cut-off raised to 6.


The scores below are sorted by initial score. 
Significance is calculated based on initial score. 


A 100% identical sequence to the query sequence was not found. 
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The list of best scores is: 


                                                        Init.  Opt.
Sequence Name   Description                    Length   Score  Score   Sig.  Frame

**** 6 standard deviations above mean ****

1.

EFTU_EUGGR

ELONGATION FACTOR TU (EF-TU).

409

9 

10 

6.18 

0 

2.

S02254 

Elongation factor Tu - Euglen 

409 

9 

10 

6.18 

0 

3 . 

EFEGT 

Elongation factor Tu - Euglen 

409 

9 

10 

6.18 

0 

**** 5 standard deviations above mean ****

4.

XYLR_STAXY

XYLOSE REPRESSOR.

383

8

8

5.29 

0 

5 . 

S16529 

*xylR protein - Staphylococcu

383

8

8 

5.29 

0 

**** 4 standard deviations above mean ****

6.

YDP3_LACLA

HYPOTHETICAL 13.0 KD PROTEIN

161 

7 

10 

4.41 

0 

7. 

B33374 

*Hypothetical protein EVRFi -

193 

7 

9 

4.41 

0 

8 . 

YEIB_ECOLI

HYPOTHETICAL PROTEIN IN GALS 

196 

7 

7 

4.41 

0 

9. 

S19934 

Hypothetical protein - Escher

196

7

7

4.41 

0 

10 . 

UL52_HSVSA

HYPOTHETICAL BSLF1 PROTEIN HO 

205 

7 

1 

i 

4.41 

0 

11.

QQBEHA

BSLF1 protein - Saimirine her

205 

7 

! 

4.41 

0 

12.

KAD_PARDE

ADENYLATE KINASE (EC 2.7.4.3)

217

7

7

4.41

0





i *t . Aire 

Adenylate kinase - Paracoccus 

217 

7 

7 

4.41 

15. KAD1_YEAST 

ADENYLATE KINASE CYTOSOLIC (E 

222 

7 

8

4.41 

16. KIBYA 

Adenylate kinase - Yeast (Sac 

222

7

8

4.41 

17. R03339 

VP1 sequence for HRV serotype 

237 

7 

9 

4.41 

18. R03340 

VP1 sequence for HRV serotype

291 

7 

8 

4.41 

19. CMG1_BACSU 

COMG OPERON PROTEIN 1.

356 

7 

7 

4.41 

20. B3033S 

*comG operon protein 1 - Baci

356 

7 

7 

4.41 

21. RRPO_LYCVW

RNA POLYMERASE (EC 2.7.7.48)

363 

7 

9 

4.41 

22. DDLA_ECOLI

D-ALANINE--D-ALANINE LIGASE A

364 

7 

7 

4.41 

23. CEECDA 

D-Alanine--D-alanine ligase A

364 

7 

7 

4.41 

24. TRPB_ACICA 

TRYPTOPHAN SYNTHASE BETA CHAI 

403 

7 

8 

4.41 

25. B36151 

*Tryptophan synthase beta cha

403 

7 

8 

4.41 

26. FIBG_BOVIN

FIBRINOGEN GAMMA-B CHAIN PREC 

444 

7 

9 

4.41 

27. S05313 

Fibrinogen gamma-B chain prec

444 

7 

9 

4.41 

28. FUCO_DICDI

ALPHA-L-FUCOSIDASE PRECURSOR 

461 

7 

10 

4.41 

29. A30364 

alpha-L-Fucosidase homolog pr 

461 

7 

10 

4.41 

30. A41533 

*Reticuline oxidase precursor

533 

7 

7 

4.41 

31. VGNZPD 

Fusion glycoprotein precursor 

631 

7 

8 

4.41 

32. NU5M_ASPNI 

NADH-UBIQUINONE OXIDOREDUCTAS

657 

7 

8 

4.41 

33. S04724 

*NADH dehydrogenase (ubiquino

657 

7 

8 

4.41 

34. UL15_HSV6U 

HYPOTHETICAL PROTEIN 12L (ORF 

667 

7 

9 

4.41 

35. QQBEH6

12L protein - Human herpesvir 

667 

7

9 

4.41 

36. CAP1_HUMAN

CALPAIN I, LARGE (CATALYTIC) 

714 

7 

9 

4.41 

37. S10591 

*Cysteine protease - Human

714 

7 

9 

4.41 

38. CIHUH

Calpain I heavy chain - Human

714 

7 

9 

4.41 

39. POLG_HRV1A

GENOME POLYPROTEIN (COAT PROT 

832 

7 

9 

4.41 

40. VCAP_HSV11

MAJOR CAPSID PROTEIN (MCP). 

1374 

7 

7 

4.41 

41. A30084 

*Gene UL19 protein (major cap

1374 

7 

7 

4.41 

42. VCBE17 

Major capsid protein - Human 

1374 

7 

7 

4.41 

43. ARO1_YEAST

PENTAFUNCTIONAL AROM POLYPEPT

1588

7

8

4.41

44. BVBYA1

ARO1 protein - Yeast (Sacchar

1588

7 

8 

4.41 

45. S18644 

*Multifunctional aminoacyl-tR

1714 

7 

8 

4.41 

46. RRPO_TACV 

RNA POLYMERASE (EC 2.7.7.48). 

2210 

7 

8

4.41

47. RRXPTV

RNA-directed RNA polymerase -

2210

7

8

4.41

48. S22011

*6-Deoxyerythronolide B synth

3567 

7 

8 

4.41 

49. A41819 

*637K proline-rich peptide pr

5762

7

9

4.41

**** 3 standard deviations above mean ****

50. R15527

Immunopeptide derived from HP

20

6

6

3.53

No alignments saved.


0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 




0 
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FastDB — Fast Pairwise Comparison of Sequences 
Release 5.4 

Results file celsa-12.res made by alexk on Thu 25 Feb 93 10:32:32-PST.


Query sequence being compared:   CELSA-12 (1-9)

Number of sequences searched:    94553

Number of scores above cutoff:   3845

Results of the initial comparison of CELSA-12 (1-9) with:
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries

[Histogram: NUMBER OF SEQUENCES (log scale, 0 to 100000) versus SCORE and STDEV]


PARAMETERS

Similarity matrix          Unitary    K-tuple              2
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0


SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            1                3                1.04

Times:      CPU                               Total Elapsed
            00:03:23.11                       00:07:28.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   3845

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.


The scores below are sorted by initial score. 
Significance is calculated based on initial score 
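
The listing is broken up by separator lines such as "5 standard deviations above mean". These appear to mark where the Sig. value drops below the next whole number, so an entry's band would be the integer part of its Sig. value. The sketch below follows that reading; `sd_band` and `group_by_band` are illustrative names rather than anything FastDB provides, and the sample rows use entry names and Sig. values from this report.

```python
import math
from itertools import groupby


def sd_band(sig: float) -> int:
    """Whole number of standard deviations above the mean (assumed: floor of Sig.)."""
    return math.floor(sig)


def group_by_band(rows):
    """Group (name, sig) rows, already sorted by descending sig, by band."""
    return {band: [name for name, _ in grp]
            for band, grp in groupby(rows, key=lambda r: sd_band(r[1]))}


rows = [("CYB_OENBE", 5.76), ("R13364", 4.80), ("P92042", 4.80), ("S19981", 3.84)]
for band, names in group_by_band(rows).items():
    print(f"{band} standard deviations above mean: {', '.join(names)}")
```

Grouping only consecutive rows is enough here because the report is already sorted by initial score, and therefore by significance.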


A 100% identical sequence to the query sequence was not found.
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The list of best scores is: 


Sequence Name 


1. CYB_OENBE

2. S20141 

3. CB0BE 

4. R13364 

5. P92042 

6. P90159 

7. S21305 

8. CYB_CHLSM
9. CYB_CHLRE 

10. S12023 

11. S11966 

12. S12016 

13. CYB_MAIZE 

14. CBZM 


Description 


Length 

Init.
Score 

Opt. 
Score 

**** 5 standard deviations above mean ****

CYTOCHROME 

B (EC 1.10.2.2). 

394 

7 

7 

Cytochrome

b - Evening primro

394

7

7

Cytochrome

b - Evening primro

394 

7 

7 


**** 4 standard deviations above mean ****

P691 HCV antigen (691-714). 

Sequence encoded in the hepat 
Sequence of hepatitis C virus 
*Apocytochrome b - Rice
CYTOCHROME B (EC 1.10.2.2). 

CYTOCHROME B (EC 1.10.2.2). 

*Cytochrome b - Chlamydomonas
*Cytochrome b - Chlamydomonas
*Cytochrome b - Chlamydomonas
CYTOCHROME B (EC 1.10.2.2). 

C. i: t. r> r In *■-- rr. v-- — m ^ ^ * „• j. „ „ i.. „ ... 


24 

141 

141 

334 

331 

381 

381 

381 

331 


6 

6 

6 

6 

6 

6 

6 

6 


6 

6 

6 

6 

6 

6 

6 

6 

6 


Sig. Frame 


5.76 0 

5.76 0 

5.76 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0 

4.80 0





No 


J. !_• V I * 


17. 

S17427

18. 

CYB_ORYSA

19. 

S20659 

20. 

CBRZ 

21 . 

CYB_WHEAT 

22. 

A22931 

23. 

CYB_MARPO

24. 

CARA_YEAST 

25. 

SYBYCS 

26. 

S20660 

27. 

P92049 

28. 

P90183 

29. 

R21577 

30. 

JQ1366 

31 . 

YCD9_YEAST

32. 

S19367 

33. 

P90164 

34. 

P92047 

35. 

P92050 

36. 

P9028S 

37. 

R08123 

38. 

R24440 

39. 

R08124 

40. 

POLG_HCVH

41.

POLG_HCV1

42. 

GNWVC3 

43. 

R22154 

44. 

R21519 

45. 

R15366 

46. 

R11408

47. 

A24265 

48. 

A25521 

49. 

S19988 

50. 

S19981 


alignments saved 


Cytochrome b - Fava bean mito 392
*Cytochrome b - Potato mitoch 393
CYTOCHROME B (EC 1.10.2.2). 397
*Apocytochrome b - Rice 397
Cytochrome b - Rice mitochond 397 
CYTOCHROME B (EC 1.10.2.2). 398 
Cytochrome b - Wheat mitochon 398 
CYTOCHROME B (EC 1.10.2.2). 404 
CARBAMOYL-PHOSPHATE SYNTHASE, 411 
Carbamoyl-phosphate synthase 411 
*Pseudo apocytochrome b - Ric 449
Sequence encoded by segment o 454 
Sequence of hepatitis C virus 454 
HCV CKS-NS1 - pHCV-107. 467 
Polyprotein - Hepatitis C vir- 716 
HYPOTHETICAL 86.0 KD PROTEIN 759 
Hypothetical protein YCL39W - 759 
Peptide encoded by composite 2261 
Sequence encoded in the hepat 2301 
Sequence encoded in the hepat 2436 
Peptide encoded by composite 2462 
Hepatitis C virus polypeptide 2772 
Composite HCV HC-J1/CDC/CHI p 2894 


Hepatitis C virus putative po 2955 

GENOME POLYPROTEIN (CONTAINS; 3011

GENOME POLYPROTEIN (CONTAINS; 3011 

Genome polyprotein - Hepatiti 3011 

NANBV Hutch c59 isolate genom 3011 

Compiled HCV sequence. 3011 

**** 3 standard deviations above mean 

Ig idiotypic determinant PSL3 12 

Hepatitis B virus pre-S2 QE2- 38 

Phosphate transport protein, 47 

Ig kappa chain V region (321) 54 

*Hypothetical protein 1708 - 65

*Hypothetical protein 1708 - 65


6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6
6 
6 
6 
6 
6 
6 

****

5 

5 

5 

5 

5 

5 


6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

5 

5 

5 

6 


4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

4.80 

3.84 

3.84 

3.84 

3.84 

3.84 

3.84 
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FastDB - Fast Pairwise Comparisen of Sequences 
Release 5.4 

Results file celsa-13.res made by alexk on Thu 25 Feb 93 10:39:48-PST. 


Query sequence being compared: CELSA-13 (1-5) 

Number of sequences searched: 94553 

Number of scores above cutoff: 4106 

Results of the initial comparison of CELSA-13 (1-5) with: 
Data bank : A-GeneSeq 9, all entries 
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries

[Histogram: NUMBER OF SEQUENCES (log scale, 0 to 100000) versus SCORE and STDEV]


PARAMETERS

Similarity matrix          Unitary    K-tuple              2
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0



SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            1                3                1.01

Times:      CPU                               Total Elapsed
            00:03:18.07                       00:06:44.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   4106



Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4. 


The scores below are sorted by initial score. 
Significance is calculated based on initial score. 


11 100% similar sequences to the query sequence were found:


                                                        Init.  Opt.
Sequence Name   Description                    Length   Score  Score   Sig.  Frame

1 . 

A37113

*Ryanodine receptor, cardiac 

4969 5 

5 

3.98 

0 

2. 

BVFFSL 

sol protein, large splice for 

1597 5 

5 

3.98 

0 

3. 

SOL_DROME

SMALL OPTIC LOBES PROTEIN. 

1597 5 

5 

3.98 

0 

4. 

CP51_CANTR

CYTOCHROME P450 LI (P450-L1A1 

523 5 

5 

3.98 

0 

5.

A31854

Cytochrome P450 51 lanosterol

528 5

5 

3.98 

0 

6. 

S12042

*Sugar transport protein STP1

522 5 

5 

3.98 

0 

7. 

STP1_ARATH 

GLUCOSE TRANSPORTER (SUGAR CA 

522 5 

5 

3.98 

0 

8. 

S14627

*Glucose transport protein -

522 5 

5 

3.98 

0 

9. 

B35901 

*Calcium channel alpha-1 chai

164 5 

5 

3.98

0 

10. 

JQ0700 

Hypothetical 11K protein (mmo 

103 5 

5 

3.98 

0 

11.

YMMO_METCA

HYPOTHETICAL 11.9 KD PROTEIN

103 5 

5 

3.98 

0 

The list of other best scores is:

                                                        Init.  Opt.
Sequence Name   Description                    Length   Score  Score   Sig.  Frame

1 o 

7j /) “7 1 ir. 

**** 2 standard deviations above mean ****

-• Y , , 4. - A U ; - J. - - - ---

n /! n 

'-i rt /'I 

r\ 






14. D24735 

Glutathione transferase r 2-2 

IV 


4

2.98

0

15. A24735 

Glutathione transferase r 1-1 

26 

4 

4 

2.98 

0 

16. S21278

*Glutathione transferase chai

28 

4 

4 

2.98 

0 

17. S0335S 

*Glutathione transferase - Ra

31

4

4

2.98

0 

18. S09585 

*Glutathione transferase - Ra

31

4

4

2.98

0 

19. V07K_PMV 

7 KD PROTEIN (ORF 4).

68 

4 

4 

2.98 

0 

20. JQ0099 

Hypothetical 7K protein - Pap 

68 

4 

4 

2.98 

0 

21. P81053

Sequence of rhinovirus HRV2 v 

69 

4 

4 

2.98 

0 

22. P81052 

Sequence of rhinovirus HRV89 

69 

4 

4 

2.98

0 

23. TRAL_ECOLI

TRAL PROTEIN. 

91 

4 

4 

2.98 

0 

24. BVECTL 

traL protein - Escherichia co 

91 

4 

4 

2.98 

0 

25. S03311 

traL protein - Escherichia co 

92 

4 

4 

2.98 

0 

26. VE5_HPV42 

PROBABLE E5 PROTEIN. 

95 

4 

4 

2.98 

0 

27. W5WL42 

E5 protein - Human papillomav 

95 

4 

4 

2.98

0 

28. YR12_CYAPA

HYPOTHETICAL 12.3 KD PROTEIN 

102 

4 

4 

2.98 

0 

29. S12809 

Hypothetical protein (rpl3 5'

102 

4 

4 

2.98 

0 

30. S14355

*Glutathione transferase - Ra

113

4

4

2.98

0 

31. THI0_ANASP 

THIOREDOXIN. 

115 

4 

4 

2.98 

0 

32. A32233 

*Thioredoxin - Anabaena sp.

115 

4 

4 

2.98 

0 

33. VG67_BPPZA 

EARLY PROTEIN GP16.7. 

130 

4 

4 

2.98

0 

34. VG67_BPPH5 

EARLY PROTEIN GP16.7. 

130 

4 

4 

2.98

0 

35. VG67_BPPH2 

EARLY PROTEIN GP16.7. 

130 

4 

4 

2.98 

0 

36. JN0033 

Early protein gpl5 - Phage ph 

130 

4 

4 

2.98 

0 

37. WRBPF5 

Early protein gpl5 - Phage ph 

130 

4 

4 

2.98

0 

38. WRBP67 

Early protein gpl5 - Phage PZ 

130 

4 

4 

2.98 

0 

39. YKP7_KLULA 

HYPOTHETICAL KILLER PLASMID P 

132 

4 

4 

2.98 

0 

40. S15966 

*Hypothetical protein 7 - Yea

132 

4 

4 

2.98 

0 

41. S00965 

Hypothetical protein 7 - Yeas 

132 

4 

4 

2.98 

0 

42. S19269 

*Glutathione transferase Yc2

139 

4 

4 

2.98 

0 

43. BCP_ECOLI 

BACTERIOFERRITIN COMIGRATORY 

156 

4 

4 

2.98 

0

44. V02_FMVD 

PROTEIN 2. 

164 

4 

4 

2.98 

0 

45. S01280

Hypothetical protein 2 - Figw 

164 

4 

4 

2.98 

0 

46. NU6M_MOUSE

NADH-UBIQUINONE OXIDOREDUCTAS

172 

4 

4 

2.98 

0 

47. DEMSN6 

NADH dehydrogenase (ubiquinon 

172 

4 

4 

2.98 

0 

48. PSP_BOVIN 

PANCREATIC STONE PROTEIN PREC 

175 

4 

4 

2.98 

0 

49. A37194 

*Pancreatic thread protein pr

175 

4 

4 

2.98 

0 

50. JS0679

Hypothetical 20K protein - Hu

175

4

4

2.98

0

No alignments saved.
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FastDB - Fast Pairwise Comparison of Sequences 
Release 5.4 

Results file celsa-14.res made by alexk on Thu 25 Feb 93 10:28:57-PST.


Query sequence being compared:   CELSA-14 (1-5)
Number of sequences searched:    94553
Number of scores above cutoff:   4593


Results of the initial comparison of CELSA-14 (1-5) with:
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries


[Histogram: NUMBER OF SEQUENCES (log scale, 0 to 100000) versus SCORE and STDEV]





PARAMETERS

Similarity matrix          Unitary    K-tuple              2
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0


SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            1                3                1.05

Times:      CPU                               Total Elapsed
            00:03:is.06                       00:06:52.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   4598




Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4. 


The scores below are sorted by initial score. 
Significance is calculated based on initial score. 


145 100% similar sequences to the query sequence were found:






                                                        Init.  Opt.
Sequence Name   Description                    Length   Score  Score   Sig.  Frame

1 . 

S18268 

*alpha-Aminoadipyl-L-cysteiny

3649 

5 

5 

3.82 

0 

2. 

ACVS_NOCLA

L-(ALPHA-AMINOADIPYL)-L-CYSTE

3649 

5 

5 

3.82 

0 

3. 

S12332 

Ubiquitin--protein ligase - Y

1950 

5 

5 

3.82 

0 

4. 

UBR1_YEAST

N-END-RECOGNIZING PROTEIN (UB

1950

5

5

3.82

0 

5. 

RRWGNV 

RNA-directed RNA polymerase - 

1643 

5 

5 

3.82 

0 

6. 

V0R1_NMV

186 KD PROTEIN (ORF 1).

1643 

5 

5 

3.82 

0 

7. 

WMWGPV 

RNA-directed RNA polymerase -

1456

5

5

3.82

0 

8. 

S14005

*Hypothetical protein, 166K -

1456 

5 

5 

3.82 

0 

9. 

V0R1_PVX 

165 KD PROTEIN (ORF 1).

1456 

5 

5 

3.82 

0 

10. 

V0R1_PVXCP 

165 KD PROTEIN (ORF 1).

1456 

5 

5 

3.82 

0 

11. 

V0R1_PVXX3

165 KD PROTEIN (ORF 1).

1456 

5 

5 

3.82 

0 

12.

VGIHJ2

E2 glycoprotein precursor - M

1376 

5 

5 

3.82 

0 

13. 

VGL2_CVM4

E2 GLYCOPROTEIN PRECURSOR (SP 

1376 

5 

5 

3.82 

0 

14 . 

VGIH59 

E2 glycoprotein precursor - M 

1324 

5 

5 

3.82 

0 

15. 

VGL2_CVMA5 

E2 GLYCOPROTEIN PRECURSOR (SP 

1324 

5 

5 

3.82 

0 

16. 

VGIHMJ 

E2 glycoprotein precursor - M 

1235 

5 

5 

3.82 

0 

17. 

VGL2_CVMJH 

E2 GLYCOPROTEIN PRECURSOR (SP 

1235 

5 

5 

3.82 

0 

18.

A46986

*M-cadherin - Mouse (fragment

730

5

5

3.82

0 

19.

CAUII/IC 

r 1 1 C ) 1 -P .TO -It .tl Vt -f 1 c-j.n -3, P P Vi |j 

"■* p p 

5

5

3.82

0 






C. 1 . H07 7D7 

i l surface antigen 

4F2 he a 

539 

5 

5 

3.82 

0 

22. 4F2_HUMAN

4F2 CELL-SURFACE ANTIGEN HEAV

529

5

5

3.82

0

23. S20477

Ribulose-bisphosphate carboxy

498

5

5

3.82

0

24. RKFPLP

Ribulose-bisphosphate carboxy

485

5

5

3.82

0

25. RKFPLB

Ribulose-bisphosphate carboxy

485

5

5

3.82

0

26. D34931

*Ribulose-bisphosphate carbox

485

5

5

3.82

0

27. RBL_FLABI

RIBULOSE BISPHOSPHATE CARBOXY

485

5

5

3.82

0

28. C34931

*Ribulose-bisphosphate carbox

485

5

5

3.82

0

29. RBL_FLAPR

RIBULOSE BISPHOSPHATE CARBOXY

485

5

5

3.82

0

30. RBL_PHYAM

RIBULOSE BISPHOSPHATE CARBOXY

482

5

5

3.82

0

31. RBL_STEHA

RIBULOSE BISPHOSPHATE CARBOXY

482

5

5

3.82

0

32. RBL_ALLPR

RIBULOSE BISPHOSPHATE CARBOXY

480

5

5

3.82

0

33. S30S47

*Ribulose-bisphosphate carbox

480

5

5

3.82

0

34. RBL_BASAL

RIBULOSE BISPHOSPHATE CARBOXY

480

5

5

3.82

0

35. RBL_MOLVE

RIBULOSE BISPHOSPHATE CARBOXY

480

5

5

3.82

0

36. RBL_RIVHU

RIBULOSE BISPHOSPHATE CARBOXY

480

5

5

3.82

0

37. RBL_TRIPO

RIBULOSE BISPHOSPHATE CARBOXY

480

5

5

3.82

0

38. RKNULT

Ribulose-bisphosphate carboxy

478

5

5

3.82

0

39. G34931

*Ribulose-bisphosphate carbox

478

5

5

3.82

0

40. RBL_NEUMU

RIBULOSE BISPHOSPHATE CARBOXY

478

5

5

3.82

0

41. RBL_NEUTE

RIBULOSE BISPHOSPHATE CARBOXY

478

5

5

3.82

0

42. RKNULM

Ribulose-bisphosphate carboxy

478

5

5

3.82

0

43. H34931

*Ribulose-bisphosphate carbox

478

5

5

3.82

0

44. P50585

Maize ribulose-biphosphate ca

477

5

5

3.82

0

45. RKNTLO

Ribulose-bisphosphate carboxy

477

5

5

3.82

0

46. RKNTLA

Ribulose-bisphosphate carboxy

477

5

5

3.82

0

47. RKRZL

Ribulose-bisphosphate carboxy

477

5

5

3.82

0

48. RBL_NICAC

RIBULOSE BISPHOSPHATE CARBOXY

477

5

5

3.82

0

49. RBL_NICOT

RIBULOSE BISPHOSPHATE CARBOXY

477

5

5

3.82

0

50. RBL_ORYSA

RIBULOSE BISPHOSPHATE CARBOXY

477

5

5

3.82

0

No alignments saved.


IntelliGenetics

FastDB - Fast Pairwise Comparison of Sequences 
Release 5.4 

Results file celsa-15.res made by alexk on Thu 25 Feb 93 10:45:14-PST.


Query sequence being compared:   CELSA-15 (1-11)

Number of sequences searched:    94553

Number of scores above cutoff:   4356

Results of the initial comparison of CELSA-15 (1-11) with:
Data bank : A-GeneSeq 9, all entries
Data bank : PIR 34, all entries
Data bank : Swiss-Prot 23, all entries

[Histogram: NUMBER OF SEQUENCES (log scale, 0 to 100000) versus SCORE and STDEV]


PARAMETERS

Similarity matrix          Unitary    K-tuple
Mismatch penalty           5          Joining penalty      20
Gap penalty                1.00       Window size          5
Gap size penalty           0.26
Cutoff score               0
Randomization group        0

Initial scores to save     50         Alignments to save   0
Optimized scores to save   0          Display context      0


SEARCH STATISTICS

Scores:     Mean             Median           Standard Deviation
            2                3                1.06

Times:      CPU                               Total Elapsed
            00:03:32.04                       00:07:11.00

Number of residues:              25433612
Number of sequences searched:    94553
Number of scores above cutoff:   4356


Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4.
Cut-off raised to 5. 


The scores below are sorted by initial score. 
Significance is calculated based on initial score. 


A 100% identical sequence to the query sequence was not found.






The list of best scores is:


                                                        Init.  Opt.
Sequence Name   Description                    Length   Score  Score   Sig.  Frame

**** 4 standard deviations above mean ****

1.

DP3B_STRCO

DNA POLYMERASE III, BETA CHAI

376

7

7

4.71

0 

**** 3 standard deviations above mean ****

2.

UL96_HCMVA

HYPOTHETICAL PROTEIN UL96.

115

6

6

3.77 

0 

3. 

S09S61 

Hypothetical protein UL96 - H 

115

6

6

3.77 

0 

4 . 

FKBP_NEUCR

FK506-BINDING PROTEIN (FKBP) 

120 

6 

6 

3.77 

0 

5 . 

S11090

*FK506-binding protein - Neur

120 

6 

6 

3.77 

0 

6 . 

PQ0185 

Tonoplast intrinsic protein b 

169 

6 

6 

3.77 

0 

7. 

POL_SFV1

POL POLYPROTEIN (REVERSE TRAN

193

6

7

3.77 

0 

8.

A33562 

*pol polyprotein - Simian Foa 

193 

6 

7 

3.77 

0 

9. 

S05586 

Hypothetical protein a - Maiz 

216 

6 

6 

3.77 

0 

10 . 

A4137S 

*Hypothetical protein (cycB 5

221 

6 

6 

3.77 

0 

11.

A05182

Alcohol dehydrogenase beta -

243 

6 

6 

3.77 

0 

12 . 

A31087 

Alzheimer's disease amyloid h 

264 

6 

6 

3.77 

0 

13.

P90609

Sequence of amy37 clone.

264

6

6

3.77 

0 

14. 

P90497 

P r- r 1 t t=* i n c 4a .“i i i o »-“i r- c:> -> l ■ . r~i i 1 L-. 


/ 

, 







16. KPR2_HUMAN 

17. KPR1_HUMAN 
18. KIRTR1

19. KIHUR1 

20. KIRTR2 

21. KIHUR2 

22. C0BD_PSEDE 

23. E36144 

24. R13495 

25. CELF_HSV2H

26. JH0143 

27. YT44_STRFR 
28. JQ0430

29. A30320 

30. A29030 

31. NCAP_DUGBV 


32. VHVUDU 

33. A32761 


34. 

GAG_HIV1U 

35.

GAG_HIV1N 

36. 

P91256 

37. 

OCTC_RAT

38.

A31948 

39. 

B30320 

40. 

UL32_HSV11 

41 . 

WMBEH2 

42. 

HEMA_MEASH

43. 

HEMA_MEASE 

44. 

HMNZED 

45. 

HMNZHA 

46. 

HEMA_MEASY 

47. 

JU0273 

48. 

S18607 

49. 

B41080 

50. 

A4_RAT 


RIBOSE-PHOSPHATE PYROPHOSPHOK
RIBOSE-PHOSPHATE PYROPHOSPHOK
Ribose-phosphate pyrophosphok

Ribose-phosphate pyrophosphok
Ribose-phosphate pyrophosphok
Ribose-phosphate pyrophosphok
COBD PROTEIN.

*cobD protein - Pseudomonas s
P. denitrificans COBD.

CELL FUSION PROTEIN PRECURSOR
Syn protein - Human herpesvir
HYPOTHETICAL 44.4 KD PROTEIN
Hypothetical 44.4K protein-
*Alzheimer's disease amyloid
Alzheimer's disease amyloid A
NUCLEOCAPSID PROTEIN (NUCLEOP
Nucleocapsid protein N - Dugb
*Alzheimer's disease amyloid
GAG POLYPROTEIN (CORE PROTEIN
GAG POLYPROTEIN (CORE PROTEIN
Recombinant gag precursor of
CARNITINE OCTANOYLTRANSFERASE
Carnitine octanoyltransferase
*Alzheimer's disease amyloid
HYPOTHETICAL PROTEIN UL32.
UL32 protein - Human herpesvi

HEMAGGLUTININ-NEURAMINIDASE (
HEMAGGLUTININ-NEURAMINIDASE (
Hemagglutinin - Measles virus
Hemagglutinin - Measles virus
HEMAGGLUTININ-NEURAMINIDASE (
Hemagglutinin - Measles virus
*Phosphotransferase system en
*Probable transketolase - Rho
ALZHEIMER'S DISEASE AMYLOID A


No alignments saved. 


317 
313 

318 
313 
318 
323 
323 
323 
338 
338 
395 
395 
412 
412 
441 
441 
434 
493 
500 
500 
523 
523 
57.4 
596 
596 
617 
617 
617 
617 
620 
620 
651 
657 
695 



6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 ‘ 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0 

6 3.77 0

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-16.res made by alexk on Thu 25 Feb 93 10:45:14-PST.

Query sequence being compared:  CELSA-16 (1-11)
Number of sequences searched:   94553
Number of scores above cutoff:  4223

Results of the initial comparison of CELSA-16 (1-11) with:
   Data bank : A-GeneSeq 9, all entries
   Data bank : PIR 34, all entries
   Data bank : Swiss-Prot 23, all entries


[Histogram: NUMBER OF SEQUENCES (logarithmic scale) plotted against SCORE and STDEV.]


Similarity matrix 
Mismatch penalty 
Gap penalty 
Gap size penalty 
Cutoff score 
Randomization group 


PARAMETERS 


Unit ary 


1.00 

0.26 


0 

0 


K-tup l e 

Joining penalty 
Window size 


Initial scores to save 50 

Optimized scores to save 0 


Alignments to save 
Display context 


5 


SO 

5 


0 

0 


SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           2       3         1.08

Times:     CPU             Total Elapsed
           00:03:26.11     00:07:10.00

Number of residues:             25433612
Number of sequences searched:   94553
Number of scores above cutoff:  4223

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.


6 


The list of best scores is: 


Sequence Name 


 1. PE25_NPVAC
 2. NTRB_VIBAL
 3. JL0114
 4. GUN_CELUD
 5. P70396
 6. IL1S_HUMAN
 7. S17428
 8. R15614
 9. R12786
10. UL32_EBV
11. GQBE6
12. SPG7_DICDI
13. B35621
14. VGLE_HSV11
15. VGBE1S


Description

Init. Opt.
Length Score Score

**** 3 standard deviations above mean

25.1 KD PROTEIN IN PE-P26 INT
NITROGEN REGULATION PROTEIN N
ntrB protein - Vibrio alginol
ENDOGLUCANASE PRECURSOR (EC 3
Cellulase.
INTERLEUKIN-1 RECEPTOR, TYPE
*Interleukin-1 receptor, type
Human type II interleukin-1 r
Actinomycete Phospholipase D.
HYPOTHETICAL PROTEIN BFLF1.
BFLF1 protein - Human herpesv
SPORE GERMINATION PROTEIN 270

219 
350 
350 
359 
359 
398 
398 
398 
524 



Spore germination protein 270 532 

GLYCOPROTEIN E PRECURSOR. 550 

fn 1 M {" r*t {**• ;-i + i C ™ ___ . — ^ 


6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 


6 

7 

7 

6 

6 

6 

6 

6 

6 

6 

6 

6 

6 


Sig. Frame 


3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 

3.71 0 






17. 

S19202 

*G1ycoprotein Ben precursor - 

baa 

6 

/ 

a. /1 

u 

18. JH0506      Adhesion molecule SCI precurs     588    6    7   3.71  0
19. JH0593      Schwann cell myelin protein p     620    6    6   3.71  0
20. VP3_RDV     MAJOR 114 KD STRUCTURAL PROTE    1019    6    6   3.71  0
21. S12826      *Hypothetical protein - Rice     1019    6    6   3.71  0
22. S12621      *114K major core protein - Ri    1019    6    6   3.71  0
23. S20548      *P-glycoprotein homolog - Yea    1362    6    6   3.71  0
24. KRE5_YEAST  KRE5 PROTEIN PRECURSOR.          1365    6    6   3.71  0
25. BVBYK5      KRE5 protein precursor - Yeas    1365    6    6   3.71  0
26. A39401      *Merozoite surface antigen 1     1726    6    6   3.71  0
27. TEGU_VZVD   LARGE TEGUMENT PROTEIN.          2763    6    6   3.71  0
28. WZBE22      Gene 22 protein - Human herpe    2763    6    6   3.71  0
                **** 2 standard deviations above mean
29. R12065      Antigenic peptide #2269 D5/19      15    5    5   2.78  0
30. S11617      Ribosomal protein HS17 - Halo      30    5    5   2.78  0
31. S19945      Protein kinase crk3 - Mouse (      35    5    5   2.78  0
32. P91884      Antigenic Epstein-Barr virus       48    5    5   2.78  0
33. S02383      Probable membrane antigen CL3      57    5    5   2.78  0
34. C27606      Ig heavy chain V-a region (p2      58    5    5   2.78  0
35. S22133      *Hypothetical protein - Esche      76    5    5   2.78  0
36. KPSJ_HUMAN  PUTATIVE SERINE/THREONINE-PRO     101    5    5   2.78  0
37. C26368      Protein-serine kinase (clone      101    5    5   2.78  0
38. NRAM_IADA1  NEURAMINIDASE (EC 3.2.1.18) (     103    5    6   2.78  0
39. CYTA_RAT    CYSTATIN ALPHA (EPIDERMAL THI     103    5    5   2.78  0
40. A00891      Sialidase - Influenza A virus     103    5    6   2.78  0
41. UDRTS2      Cystatin alpha - Rat              103    5    5   2.78  0
42. HM20_XENLA  HOMEOTIC PROTEIN NRL-20 (FRAG     105    5    5   2.78  0
43. S12181      *Xl-pou protein - African cla     105    5    5   2.78  0
44. S21S56      *Collagen alpha 1(X) chain -      106    5    5   2.78  0
45. S15326      *Collagen - Human                 106    5    5   2.78  0
46. KV6K_MOUSE  IG KAPPA CHAIN V-VI REGION (N     108    5    5   2.78  0
47. PRVA_RAJCL  PARVALBUMIN ALPHA.                109    5    5   2.78  0
48. S11125      *Ig heavy chain V region - Mo     109    5    5   2.78  0
49. PVRYC       Parvalbumin - Thornback ray       109    5    5   2.78  0
50. P70092      Sequence encoded by (2'-5') o     110    5    6   2.78  0


No alignments saved.

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4

Results file celsa-17.res made by alexk on Thu 25 Feb 93 10:29:14-PST.

Query sequence being compared:  CELSA-17 (1-13)
Number of sequences searched:   94553
Number of scores above cutoff:  4735

Results of the initial comparison of CELSA-17 (1-13) with:
   Data bank : A-GeneSeq 9, all entries
   Data bank : PIR 34, all entries
   Data bank : Swiss-Prot 23, all entries


[Histogram: NUMBER OF SEQUENCES (logarithmic scale) plotted against SCORE and STDEV.]


PARAMETERS

Similarity matrix        Unitary     K-tuple
Mismatch penalty               5     Joining penalty
Gap penalty                 1.00     Window size
Gap size penalty            0.26
Cutoff score                   0
Randomization group            0

Initial scores to save        50     Alignments to save
Optimized scores to save       0     Display context


SEARCH STATISTICS 


Scores: 

Mean 

£ 

Median 

"7 

Standard Deviation 
1.13 

Times:     CPU             Total Elapsed
           00:03:37.03     00:07:11.00

Number of residues:             25433612
Number of sequences searched:   94553
Number of scores above cutoff:  4735

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.


The list of best scores is: 


Sequence Name Description 


Init. Dpt. 

Length Score Score Sig. Frame 


 1. LV4A_HUMAN
 2. L4HUBU
 3. HMN3_DROME
 4. C33976
 5. ATP6_STRPU
 6. S01505
 7. TRPD_BACSU
 8. B22794
 9. NPBS
10. TRPD_BACPU
11. JH0099
12. A31093


**** 5 standard deviations above mean
IG LAMBDA CHAIN V-IV REGION (
Ig lambda chain V-IV region (
**** 4 standard deviations above mean
HOMEOBOX PROTEIN NK-3 (FRAGME
*NK-3 homeotic protein - Frui
ATP SYNTHASE A CHAIN (EC 3.6.
H+-transporting ATP synthase
ANTHRANILATE PHOSPHORIBOSYLTR
Anthranilate phosphoribosyltr
Anthranilate phosphoribosyltr
ANTHRANILATE PHOSPHORIBOSYLTR
trpD protein - Bacillus pumil
**** 3 standard deviations above mean
Stage IV sporulation protein
- .. , ... ... * 4 t aO-, - M


106 

S 

8 

5.32 

106 

e mean 

8 

8 

5.32 

194 

7 

7 

4.43 

194 

7 

7 

4.43 

229 

7 

7 

4.43 

229 

7 

*7 

1 

4.43 

337 

7 

7 

4.43 

337 

7 

7 

4.43 

337 

7 

7 

4.43 

340 

7 

7 

4.43 

340 

.‘8 an 

7 

■# * # # 

7 

4.43 

tr 

4 3 

A 

£3 

,j - 0 u 

AO 

6 

£> 

3.35 


0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 





*W« L_ / x 


15.             ... homeotic protein - M           62    6    6   3.55  0
16. V7K_BYDVP   6.7 KD PROTEIN (ORF 5).            63    6    6   3.55  0
17. S00951      Hypothetical protein, 6.7K -       63    6    6   3.55  0
18. S15534      Homeotic protein Hox 3A - Hum      66    6    6   3.55  0
19. S15533      Homeotic protein Hox 2D - Hum      66    6    6   3.55  0
20. S13785      *Homeotic protein m31 - Mouse      67    6    6   3.55  0
21. HM24_HUMAN  HOMEOBOX PROTEIN HOX-2.4 (HOX      70    6    6   3.55  0
22. B37042      *Homeotic protein Hox 2.4 - H      70    6    6   3.55  0
23. NU4M_ARTSX  NADH-UBIQUINONE OXIDOREDUCTAS      71    6    8   3.55  0
24. S01211      NADH dehydrogenase (ubiquinon      71    6    8   3.55  0
25. HM24_CHICK  HOMEOBOX PROTEIN CHOX-2.4 (FR      74    6    6   3.55  0
26. S10884      *Homeotic protein Hox 2.4 - C      74    6    6   3.55  0
27. S14975      *Extensin class II - Tomato        82    6    6   3.55  0
28. HM54_HUMAN  HOMEOBOX PROTEIN HOX-5.4 (HOX      95    6    6   3.55  0
29. B32830      *Hox5.4 homeotic protein - Hu      95    6    6   3.55  0
30. S05957      Homeotic protein Hox 4E - Hum      95    6    6   3.55  0
31. YENM_SALTY  HYPOTHETICAL 10.4 KD PROTEIN       99    6    6   3.55  0
32. S15521      *Homeotic protein Hox 4.3 - M      99    6    6   3.55  0
33. S16177      *Homeotic protein Hox 4.3 - M      97    6    6   3.55  0
34. HMR4_RAT    HOMEOBOX PROTEIN R4 (FRAGMENT     104    6    6   3.55  0
35. HMRA_RAT    HOMEOBOX PROTEIN R1A (FRAGMEN     114    6    6   3.55  0
36. KV3J_HUMAN  IG KAPPA CHAIN PRECURSOR V-II     116    6    6   3.55  0
37. K3HUVH      Ig kappa chain precursor V-II     116    6    6   3.55  0
38. R15106      hCG/bLH chimera, DIO              145    6    6   3.55  0
39. NODN_RHIME  NODULATION PROTEIN N.             160    6    6   3.55  0
40. S16566      *nolN protein - Rhizobium mel     160    6    6   3.55  0
41. IT1B_PSOTE  TRYPSIN INHIBITOR 1B (WTI-1B)     172    6    6   3.55  0
42. IT1A_PSOTE  TRYPSIN INHIBITOR 1A (WTI-1A)     172    6    6   3.55  0
43. TIWDKB      Trypsin inhibitor 1B (Kunitz)     172    6    6   3.55  0
44. TIWDK       Trypsin inhibitor 1A (Kunitz)     172    6    6   3.55  0
45. FRI3_RANCA  FERRITIN, LOWER SUBUNIT (FERR     173    6    7   3.55  0
46. B27S05      Ferritin chain L - Bullfrog       174    6    7   3.55  0
47. CRTZ_ERWUR  CRTZ PROTEIN (EC                  175    6    6   3.55  0
48. F37S02      *crtZ protein - Erwinia uredo     175    6    6   3.55  0
49. R07463      Polypeptide with enzymatic ac     175    6    6   3.55  0
50. IT2_PSOTE   TRYPSIN INHIBITOR 2 (WTI-2).      182    6    6   3.55  0

No alignments saved.

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4 

Results file celsa-18.res made by alexk on Thu 25 Feb 93 10:37 -PST.

Query sequence being compared:  CELSA-18 (1-15)
Number of sequences searched:   94553
Number of scores above cutoff:  3965

Results of the initial comparison of CELSA-18 (1-15) with:
   Data bank : A-GeneSeq 9, all entries
   Data bank : PIR 34, all entries
   Data bank : Swiss-Prot 23, all entries

[Histogram: NUMBER OF SEQUENCES (logarithmic scale) plotted against SCORE and STDEV.]



PARAMETERS 

Similarity matrix 


Unitary 

K-tuple 

Mismatch penalty 


5 

Joining penalty 

Gap penalty 


1.00 

Window size 

Gap size penalty 


0.26 


Cutoff score 


0 


Randomization group 


0 


Initial scores to save 

50 

Alignments to save 

Optimized scores to 

s ave 0 

Display context 


SEARCH STATISTICS 


8 



20 

5 


0 

0 


Scores:    Mean    Median    Standard Deviation
                   3         1.00

Times:     CPU             Total Elapsed
           00:03:29.06     00:07:15.00

Number of residues:             25433612
Number of sequences searched:   94553
Number of scores above cutoff:  3965

Cut-off raised to 2.
Cut-off raised to 3.
Cut-off raised to 4.
Cut-off raised to 5.

The scores below are sorted by initial score.
Significance is calculated based on initial score.

A 100% identical sequence to the query sequence was not found.


The list of best scores is: 


Sequence Name Description Length 


Init. Opt. 
Score Score 


1. PPD1_BOVIN
2. A25274
3. A41179
4. PC1_MOUSE
5. PC1_HUMAN
6. A27410
7. S21706
8. A39216


**** 6 standard deviations above mean

PHOSPHODIESTERASE I (EC 3.1.4
Phosphodiesterase I - Bovine
*Protein kinase PC-1 - Bovine
PLASMA-CELL MEMBRANE GLYCOPRO
PLASMA-CELL MEMBRANE GLYCOPRO
Plasma cell membrane protein
*Pyrophosphatase - Human
*Plasma cell membrane protein

61
61
289
871
873
905

**** 5 standard deviations above mean


 9. A37253      Hypothetical protein, 13K - M     115
10. A28336      Hypothetical protein, 13K - M     115
11. A2749S      Coagulation Factor V precurso    1600
12. FA5_HUMAN   COAGULATION FACTOR V PRECURSO    2224
13. A28023      Coagulation Factor V precurso    2224

, * 

-r- r- i l ifT !in 

1 A o r- c* Tcr-iiMCMT P P DTP 7 M _ 

2763 


♦♦♦* 

9 

9 

9 

9 

9 

9 

9 

Q 

♦ ♦♦♦ 


9 

9 

9 

9 

9 

9 

9 

9 

8 

O 


Sig. Frame 


6.98 0 

6.98 0 

6.98 0 

6.98 0 

6.98 0 

6.98 0 

6.98 0 

6.98 0 

5.93 0 

5.98 0 

5.98 0 

5.98 0 

5.93 0 

5.98 0 





                **** 3 standard deviations above mean
16. B32268      *Carcinoembryonic antigen pre     120    6    6   3.99  0
17. PQ0328      Mucin-like peptide - Human (f     141    6    6   3.99  0
18. SODC_HAEPA  SUPEROXIDE DISMUTASE PRECURSO     187    6    6   3.99  0
19. SODC_HAEIN  SUPEROXIDE DISMUTASE LIKE PRO     187    6    6   3.99  0
20. B41654      *Superoxide dismutase (Cu-Zn)     187    6    6   3.99  0
21. A41654      *Superoxide dismutase (Cu-Zn)     187    6    6   3.99  0
22. P33190      [delta 1-5 DC Arg358] Alpha 1-a   390    6    6   3.99  0
23. P83189      [Ala357, Arg358] Alpha 1-anti     395    6    6   3.99  0
24. P40134      Sequence of human alpha-1-ant     418    6    6   3.99  0
25. P94665      Human alpha-1-antitrypsin as      418    6    6   3.99  0
26. FUCP_ECOLI  L-FUCOSE PERMEASE.                438    6    6   3.99  0
27. WQECFP      Fucose permease - Escherichia     438    6    6   3.99  0
28. GSH1_ECOLI  GLUTAMATE--CYSTEINE LIGASE (E     518    6    6   3.99  0
29. SYECEC      Glutamate--cysteine ligase -      518    6    6   3.99  0
30. TETO_CAMJE  TETRACYCLINE RESISTANCE PROTE     637    6    6   3.99  0
31. A29809      Tetracycline resistance prote     637    6    6   3.99  0
32. TETO_STRMU  TETRACYCLINE RESISTANCE PROTE     639    6    6   3.99  0
33. TETO_CAMCO  TETRACYCLINE RESISTANCE PROTE     639    6    6   3.99  0
34. S063S3      *Tetracycline resistance prot     639    6    6   3.99  0
35. A31098      Tetracycline resistance prote     639    6    6   3.99  0
36. CR72_BACTI  72 KD CRYSTAL PROTEIN (DELTA      643    6    6   3.99  0
37. P91462      67-kD protein toxin.              643    6    6   3.99  0
38. DPOL_HPBV4  DNA POLYMERASE (EC 2.7.7.7).      730    6    6   3.99  0
39. DPOL_HPBVZ  DNA POLYMERASE (EC 2.7.7.7) (     750    6    6   3.99  0
40. JDVLVH      DNA-directed DNA polymerase -     750    6    6   3.99  0
41. DPOM_HPBVY  DNA POLYMERASE (EC 2.7.7.7) (     832    6    6   3.99  0
42. DPOL_HPBVY  DNA POLYMERASE (EC 2.7.7.7) (     832    6    6   3.99  0
43. DPOL_HPBVA  DNA POLYMERASE (EC 2.7.7.7).      832    6    6   3.99  0
44. S20748      *Hypothetical protein (preS1      832    6    6   3.99  0
45. S20757      *Hypothetical protein (preS1      832    6    6   3.99  0
46. JDVLA1      DNA-directed DNA polymerase -     832    6    6   3.99  0
47. JDVLVB      DNA-directed DNA polymerase -     832    6    6   3.99  0
48. JDVLVA      DNA-directed DNA polymerase -     832    6    6   3.99  0
49. JDVLVS      DNA-directed DNA polymerase -     842    6    6   3.99  0
50. DPOL_HPBVR  DNA POLYMERASE (EC 2.7.7.7).      843    6    6   3.99  0

No alignments saved.

FastDB - Fast Pairwise Comparison of Sequences
Release 5.4 

Results file celsa-19.res made by alexk on Thu 25 Feb 93 10:37:29-PST.

Query sequence being compared:  CELSA-19 (1-22)
Number of sequences searched:   94553
Number of scores above cutoff:  4944

Results of the initial comparison of CELSA-19 (1-22) with:
   Data bank : A-GeneSeq 9, all entries
   Data bank : PIR 34, all entries
   Data bank : Swiss-Prot 23, all entries

[Histogram: NUMBER OF SEQUENCES (logarithmic scale) plotted against SCORE and STDEV.]



PARAMETERS

Similarity matrix        Unitary     K-tuple                  2
Mismatch penalty               5     Joining penalty         20
Gap penalty                 1.00     Window size              5
Gap size penalty            0.26
Cutoff score                   0
Randomization group            0

Initial scores to save        50     Alignments to save       0
Optimized scores to save       0     Display context          0

SEARCH STATISTICS

Scores:    Mean    Median    Standard Deviation
           2       3         1.06

Times:     CPU             Total Elapsed
           00:03:41.96     00:07:24.00

Number of residues:             25433612
Number of sequences searched:   94553
Number of scores above cutoff:  4944



Cut-off raised to 2. 
Cut-off raised to 3. 
Cut-off raised to 4. 
Cut-off raised to 5. 


The scores below are sorted by initial score. 

Significance is calculated based on initial score. 

A 100% identical sequence to the query sequence was not found. 
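
The "Sig." column in these listings looks like a standard score of the initial score, that is (initial score - mean) / standard deviation. The FastDB output does not state this formula; it is inferred from the printed numbers, so the Python sketch below rests on that assumption, and the function name `significance` is hypothetical. The report prints the mean rounded to a whole number, so the sketch matches the listed values only approximately (about 10.38 against the printed 10.41 for an initial score of 13 in the CELSA-19 run).

```python
def significance(initial_score: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations initial_score lies above mean.

    Assumption: this mirrors the "Sig." column; the printout itself only
    says that significance is calculated from the initial score.
    """
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (initial_score - mean) / std_dev


# Search statistics printed for the CELSA-19 run (the mean is shown rounded).
MEAN = 2
STD_DEV = 1.06

for score in (13, 12, 8, 7, 6):
    print(score, round(significance(score, MEAN, STD_DEV), 2))
```

The same function also reproduces the "N standard deviations above mean" group markers, since each marker is the whole-number part of the significance of the rows that follow it.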


The list of best scores is:

                                                       Init.  Opt.
    Sequence Name  Description                 Length  Score  Score   Sig.  Frame

                **** 10 standard deviations above mean ****
 1. PC1_HUMAN   PLASMA-CELL MEMBRANE GLYCOPRO     873   13   14  10.41  0
 2. S21706      *Pyrophosphatase - Human          925   13   14  10.41  0
 3. A39216      *Plasma cell membrane protein     925   13   14  10.41  0
                **** 9 standard deviations above mean ****
 4. PC1_MOUSE   PLASMA-CELL MEMBRANE GLYCOPRO     871   12   13   9.46  0
 5. A27410      Plasma cell membrane protein      905   12   13   9.46  0
                **** 5 standard deviations above mean ****
 6. S2072S      *Myosin light chain - Slime m     150    8    8   5.68  0
 7. MLE_DICDI   MYOSIN, ESSENTIAL LIGHT CHAIN     166    8    8   5.68  0
 8. A23127      Myosin light chain, nonmuscle     166    8    8   5.68  0
 9. JQ0367      Hypothetical protein 332, pic     332    8    9   5.68  0
                **** 4 standard deviations above mean ****
10. PHA3_FREDI  C-PHYCOCYANIN-3 ALPHA CHAIN.      162    7    8   4.73  0
11. S05712      Phycocyanin 3 alpha chain - C     162    7    8   4.73  0

IP iji.-rr 

o o HTiri m a -i 

1 L. 

-7 

o 

.1 ~7 "7 

r\ 





14. CY1_NEUCR   CYTOCHROME C1, HEME PROTEIN P
15. A27187      Cytochrome c1, heme protein p     332    7    7   4.73  0
16. JN0260      *Enterobacterial common antig     348    7    7   4.73  0
17. YIFC_ECOLI  HYPOTHETICAL 39.6 KD IN RFE 3     349    7    7   4.73  0
18. PAT1_SOLTU  PATATIN B1 PRECURSOR (POTATO      377    7    7   4.73  0
19. S05593      Patatin (clone pPATB1) - Pota     377    7    7   4.73  0
20. PAT5_SOLTU  PATATIN T5 PRECURSOR (POTATO      386    7    7   4.73  0
21. PAT0_SOLTU  PATATIN PRECURSOR (POTATO TUB     386    7    7   4.73  0
22. A26017      Patatin T5 precursor - Potato     386    7    7   4.73  0
23. JS0624      Fatty acid beta oxidation mul     391    7    7   4.73  0
24. VGL2_CVMJH  E2 GLYCOPROTEIN PRECURSOR (SP    1235    7    7   4.73  0
25. VGIHMJ      E2 glycoprotein precursor - M    1235    7    7   4.73  0
26. VGL2_CVMA5  E2 GLYCOPROTEIN PRECURSOR (SP    1324    7    7   4.73  0
27. VGIH59      E2 glycoprotein precursor - M    1324    7    7   4.73  0
28. VGL2_CVM4   E2 GLYCOPROTEIN PRECURSOR (SP    1376    7    7   4.73  0
29. VGIHJ2      E2 glycoprotein precursor - M    1376    7    7   4.73  0
                **** 3 standard deviations above mean ****
30. R13137      GPIb alpha peptide fragment.       15    6    7   3.79  0
31. THN6_HORVU  LEAF-SPECIFIC THIONIN (CLONE       46    6    6   3.79  0
32. S00S25      Thionin BTH6, leaf - Barley        46    6    6   3.79  0
33. C551_ECTHA  CYTOCHROME C551.                   78    6    7   3.79  0
34. CCER51      Cytochrome c551 - Ectothiorho      78    6    7   3.79  0
35. IMMC_ECOLI  IMMUNITY PROTEIN FOR CLOACIN.      85    6    6   3.79  0
36. B28585      Immunity protein - Escherichi      85    6    6   3.79  0
37. IMECP3      Immunity protein - Escherichi      85    6    6   3.79  0
38. FKBP_NEUCR  FK506-BINDING PROTEIN (FKBP)      120    6    6   3.79  0
39. S11090      *FK506-binding protein - Neur     120    6    6   3.79  0
40. THNL_HORVU  LEAF-SPECIFIC THIONIN PRECURS     137    6    7   3.79  0
41. THN4_HORVU  LEAF-SPECIFIC THIONIN PRECURS     137    6    7   3.79  0
42. S07648      Thionin precursor, leaf (clon     137    6    7   3.79  0
43. MLR_PHYPO   MYOSIN REGULATORY LIGHT CHAIN     147    6    7   3.79  0
44. S17769      *3-Dehydroquinase - Mycobacte     147    6    6   3.79  0
45. A29910      Myosin calcium-binding light      147    6    7   3.79  0
46. S09470      *Inter-alpha-trypsin inhibito     151    6    8   3.79  0
47. PHEA_MASLA  PHYCOERYTHROCYANIN ALPHA CHAI     162    6    7   3.79  0
48. PHCA_SYNPW  C-PHYCOCYANIN ALPHA CHAIN.        162    6    7   3.79  0
49. PHA2_FREDI  C-PHYCOCYANIN-2 ALPHA CHAIN,      162    6    7   3.79  0
50. JQ0764      *Phycoerythrocyanin alpha cha     162    6    7   3.79  0

No alignments saved.



fl 



3. 5,003,047, Mar. 26, 1991, Method for purifying biologically active
ligate; Martin L. Yarmush, et al., 530/413, 412, 421, 427 [IMAGE
AVAILABLE]

=> log y
U.S. Patent & Trademark Office LOGOFF AT 14:10:?2 ON 26 FEB 93

TABLE 4. Peptide sequences for Autotaxin


PEPTIDE   AMINO ACID                 SEQ ID
NO.       SEQUENCE                   NO:             NAME

 1.       UHVAAN                     SEQ ID NO:1     ATX 18
 2.       PXLDVYK                    SEQ ID NO:2     ATX 19
 3.       YPAFX                      SEQ ID NO:3     ATX 20
 4.       QAEVS                      SEQ ID NO:4     ATX 24
 5.       PEEVTXPNYL                 SEQ ID NO:5     ATX 29
 6.       YDVPWNETI                  SEQ ID NO:6     ATX 47
 7.       SPPFENINLY                 SEQ ID NO:7     ATX 48
 8.       ggqplwitatk                SEQ ID NO:8     ATX 100
 9.       VNSMQTVFVGY-               SEQ ID NO:9     ATX 101
          GPTFK
10.       DIEHLTSLDFFR               SEQ ID NO:10    ATX 102
11.       TEFLSNYITNVDD-             SEQ ID NO:11    ATX 103
          ITLVPGTLGR
12.       QYLHQYGSS                  SEQ ID NO:26    ATX 37
13.       VLNYF                      SEQ ID NO:27    ATX 39
14.       YLNAT                      SEQ ID NO:28    ATX 40
15.       HLLYGRPAVLY                SEQ ID NO:29    ATX 41
16.       SYPEILTPADN                SEQ ID NO:30    ATX 44
17.       XYGFLFPPYISSSP             SEQ ID NO:31    ATX 53
18.       TFPNLYVF/LAQGIYVS          SEQ ID NO:32    ATX 59
19.       VNVISGPIFDYDYDGLH/A-       SEQ ID NO:33    ATX 104
          DTEDK


Peptide numbers ... refer to peaks numbered in FIG. 11. Peptide numbers
12-18 refer to ... from a preparation which yielded peptide ... . Peptides
8-11 and 19 are from a separate purification, not shown in FIG. 11.

X refers to potentially glycosylated residues.



d his

     (FILE 'HOME' ENTERED AT 13:34:23 ON 26 FEB 93)

     FILE 'REGISTRY' ENTERED AT 13:34:37 ON 26 FEB 93
L1            0 S DIEHLTSLDFFR/SQEP
L2            0 S DIEHLTSLDFFR/SQSP

=> log y
COST IN U.S. DOLLARS                 SINCE FILE      TOTAL
                                          ENTRY    SESSION
FULL ESTIMATED COST                       SB.20      26.46

STN INTERNATIONAL LOGOFF AT 13:36:24 ON 26 FEB 93




